The 'hidden', 'invisible' or 'deep' web is simply the vast amount of online data that search engines cannot see. In fact, the bulk of web 'content' is posted in a form that search engines can't index. There are many reasons for this but, from the perspective of the journalist and online researcher, the most important is that most databases can't be indexed by standard search engines, either because they are in the wrong format or because they can only be accessed via a query page.
The important issue here is to be aware that the 'hidden' web exists. There may be relevant databases and tools out there that could be crucial for your investigation. One example emerged recently on Help Me Investigate. One user wanted to know if there is any way to access information about the previous rulings made by the often controversial British judge Justice David Eady. That kind of context can make or break an investigation, and if you know where to look and how to find information like that efficiently, it can make all the difference.
Our work on this issue pinpointed the British and Irish Legal Information Institute (BAILII) as the most likely source of historical data on a judge's rulings. The BAILII database is a gold mine of information on British and Irish case law and legislation, and is a good example of how valuable publicly accessible (and free) databases can be. Not only does the source provide access to 76 databases and 200,000 searchable documents, it is maintained by a charitable trust and is hosted by reliable parties.
Through BAILII you can access a range of databases from England and Wales, Scotland, Northern Ireland, Ireland and Europe. You can search by subject or name and use advanced search options. After some careful experimentation with the 'multidatabase search' option we found it is possible to search for references to a specific judge. Using the phrase search 'Justice Eady' and adding relevant date ranges, we found it is possible to obtain all previous rulings, although this search also returns cases that merely refer to statements by Justice Eady.
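The phrase-plus-date-range search described above can be sketched in a few lines of Python. This is only an illustration of how such a query is typically assembled: the parameter names `q`, `from` and `to` are hypothetical and are not BAILII's actual form fields, so check the query page of whichever database you are using before relying on them.

```python
from urllib.parse import urlencode


def build_phrase_query(phrase, date_from, date_to):
    """Build a URL-encoded query string for an exact-phrase search.

    The parameter names ("q", "from", "to") are assumptions made for
    illustration; a real database will define its own.
    """
    params = {
        "q": f'"{phrase}"',   # double quotes request an exact phrase match
        "from": date_from,    # start of the date range
        "to": date_to,        # end of the date range
    }
    return urlencode(params)


# Example date range chosen arbitrarily for illustration.
print(build_phrase_query("Justice Eady", "2005-01-01", "2009-12-31"))
```

Quoting the phrase matters: without the quotation marks most search forms treat 'Justice' and 'Eady' as separate terms and return far more irrelevant hits.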
Once you have located a relevant database it is important to practise with its user interface, which is unlikely to be as intuitive as a normal search engine. Be precise and accurate.
The BAILII database demonstrates the importance of identifying and carefully using hidden web resources, and this search strategy can be applied more often than most people would imagine. It is, for example, notoriously difficult to pin down information about people using normal search engines. Part of the reason is that information about people is often tied up in databases: professional registers; census data; records of births, deaths and marriages; and occupational databases. To solve this problem, dedicated tools have emerged that take you deeper and further than any standard search engine can. For example, pipl.com targets 'hidden' web content specifically, and 192.com is the UK's most important tool for accessing information about individuals through the UK's electoral roll.
Try to get used to looking for hidden web content routinely. Imagine you are doing a story on road accidents in a UK region. You decide that historical data would be useful background. In fact, it is crucial for putting your story into context. One tactic is simply to search for a relevant database using a standard search engine. If you put the simple query 'road accident database' into Google (UK pages) you get the National Statistics pages on road accidents as the top hit, and the other hits include relevant local and regional databases. Use the same technique to pinpoint other searchable databases such as 'hazardous substances database', 'register of doctors', or 'flight accident database'.
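If you run this kind of database hunt regularly, the queries can be generated rather than typed. The sketch below is one possible approach, not a prescribed method: it appends the word 'database' to each topic and builds an ordinary Google search URL for it. The function name and the topic list are illustrative, and the sketch does not reproduce the 'UK pages' restriction mentioned above.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.google.com/search"


def database_search_urls(topics):
    """Map each '<topic> database' query to a search URL for that query."""
    urls = {}
    for topic in topics:
        query = f"{topic} database"
        urls[query] = f"{SEARCH_URL}?{urlencode({'q': query})}"
    return urls


# Topics taken from the examples in the text.
topics = ["road accident", "hazardous substances", "flight accident"]
for query, url in database_search_urls(topics).items():
    print(query, "->", url)
```

Queries that do not follow the '<topic> database' pattern, such as 'register of doctors', are easier to enter by hand.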
Other tools for accessing the hidden web are detailed below and the 'further reading' section in this Wikipedia page is useful.
Dedicated tools:
Directories of hidden web content:
The Librarian's Index to the Internet