Searching the Internet
Most of you will be familiar with the search engine Google. Search engines are only one tool you can use to find websites – subject gateways and directories are others.
Search Engines
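Search engines create an index of websites with software called 'spiders' or 'web crawlers'. These programs select web pages for the index automatically, typically by following links from one page to another, rather than by having people choose them. They generally cannot index information held in databases.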
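To make this more concrete, here is a minimal sketch of how a crawler might build a keyword index. It assumes the crawler discovers pages by following links from a starting page. The miniature "web" (`TOY_WEB`), its page names and the `crawl` function are all made up for illustration; real search engines are far more sophisticated.

```python
from collections import deque

# A hypothetical miniature "web": each address maps to
# (the text on the page, the links found on the page).
TOY_WEB = {
    "home.example": ("welcome to the library guide",
                     ["search.example", "gateway.example"]),
    "search.example": ("search engines index the web", ["home.example"]),
    "gateway.example": ("subject gateways list selected sites", []),
    "orphan.example": ("no page links to this one", []),
}


def crawl(start_url, web):
    """Follow links from start_url and build a keyword index."""
    index = {}                  # keyword -> set of pages containing it
    seen = {start_url}          # pages already queued, to avoid loops
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = web[url]
        # Record every word on the page against the page's address.
        for word in text.split():
            index.setdefault(word, set()).add(url)
        # Queue any linked pages that have not been visited yet.
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl("home.example", TOY_WEB)
indexed_pages = set().union(*index.values())

print(sorted(index["search"]))            # ['search.example']
print("orphan.example" in indexed_pages)  # False
```

In this sketch the page `orphan.example` never reaches the index because no other page links to it, which gives a rough idea of why a crawler can miss parts of the web.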
Advantages and disadvantages of using search engines
| ADVANTAGES | DISADVANTAGES |
| --- | --- |
| Indexes for search engines can be very large because they are created by software. | Sites are crawled and indexed for keywords automatically by software. Not even the largest search engine can crawl all of the web; it is too vast. Results are not evaluated for quality. |
| They are very simple to search. | Searching can return far too many results to check properly. Results may not be in the most relevant order. More authoritative sources may be buried due to lack of popularity. |
Subject Gateways and Directories
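Subject Gateways: Individuals and organisations have created what are called “subject gateways”: collections of links to websites on a particular subject, selected and evaluated by people rather than by software.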
These may be created using Web 2.0 tools such as wikis (e.g. Quantiki) and del.icio.us tagging (e.g. architecture).
The National Library of Australia has created a wide range of subject guides listing authoritative sources. Try this one for health. Many of the databases listed are available from CDU Library.
Advantages and disadvantages of subject directories
| ADVANTAGES | DISADVANTAGES |
| --- | --- |
| Websites are selected and evaluated by people who may be experts, or may simply be enthusiastic, about a subject. | Collections are usually small because selecting and evaluating sites is labour intensive, and the selection may be subjective and biased. They are often not kept up to date and may not contain all of the relevant sites on a topic. |
| Can be browsed from a main subject area, making it easier for people who may be unfamiliar with a topic. | It may be difficult to work out which main subject will contain the topic you want. Sites may be classified under different subjects in different directories. There are no standard subject groupings. |
The invisible web?
The invisible web refers to sources that cannot be found directly by using a search engine. This doesn't always mean the information is unavailable.
Some examples include:
- sources only available via subscription, e.g. academic databases and many online journals.
- information in databases located on a website. To find a phone number from the White Pages, you need to go to the site first and then search.
While the information is readily available, if you don't know about the White Pages you won't know where to search. This is why they still advertise on TV.
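The phone directory example can be sketched in a few lines of code. This is only an illustration under simple assumptions: the site has an ordinary home page that a crawler can read, while the listings sit in a database that only answers searches typed into the site's own form. The names `HOME_PAGE`, `LISTINGS_DB`, `crawler_view` and `site_search`, and the phone numbers, are all made up.

```python
# Hypothetical phone directory site. The home page is ordinary text,
# but the listings live in a database behind a search form.
HOME_PAGE = {"text": "white pages find a phone number", "links": []}
LISTINGS_DB = {  # made-up names and numbers
    "smith j": "08 5550 0101",
    "nguyen t": "08 5550 0102",
}


def crawler_view(page):
    """What a link-following crawler can index: the words on the page."""
    return set(page["text"].split())


def site_search(name):
    """What a visitor gets by typing a name into the site's search form."""
    return LISTINGS_DB.get(name.lower())


indexed_words = crawler_view(HOME_PAGE)

print("smith" in indexed_words)  # False
print(site_search("Smith J"))    # 08 5550 0101
```

The crawler only ever sees the words on the home page, so the listing for "smith" is not in its index, yet a visitor who knows to go to the site and search there finds the number straight away.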
Advantages and disadvantages of using the invisible web
| ADVANTAGES | DISADVANTAGES |
| --- | --- |
| Invisible web resources specialise in particular subject areas, so you are likely to get more comprehensive and relevant results. | An invisible web directory may not list every invisible web resource, so you may still miss relevant resources. Tip: use a search engine to find invisible web resources by searching for general keywords with the term 'database', or use a subject directory, as many of the resources they index are from the invisible web. |
| Search interfaces on invisible web resources are designed for the type of search you are doing, e.g. searching the phone book will find phone numbers more successfully than a search engine. | It can be difficult to know which invisible web resource to select from an invisible web directory. |
| Invisible web resources are often created by organisations and institutions that are authorities on the subjects they cover, so they are often more reliable sources. | Information on the invisible web still needs to be evaluated carefully, as not all invisible web sources are credible. |
| A lot of information that is on the invisible web cannot be found elsewhere on the web. | Information may change quickly and become unavailable, or may become part of the visible web. |
Source: Sherman, Chris and Price, Gary (2001) The Invisible Web: uncovering information sources the search engines can't see, Information Today Inc, New Jersey