Monday, July 30, 2007

DEEP Web


For this blog entry I thought I would share some useful information with my classmates. One of my professors taught my class about the DEEP Web and the many benefits that it holds. So you might be asking yourself:

What is the DEEP Web?
The DEEP Web contains web resources that lie below the surface web.
* Surface web = web that is indexed by search engines and subject directories
* Most of the DEEP Web is in searchable databases

DEEP Web stats:
* The DEEP Web contains 7,500 terabytes of information (compared with 19 terabytes in the surface web)
* The DEEP Web contains nearly 550 billion individual documents (compared with 1 billion in the surface web)
* A study at the NEC Research Institute (2001) reported that Internet searchers are searching only 0.03% of the pages available to them.
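
To put those numbers side by side, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted in the list above, which I have not verified independently, and the variable names are my own. Note that the 0.03% figure comes from the study itself and is not something this arithmetic reproduces; it appears to measure something different from the simple document counts.

```python
# Figures as quoted in the stats list above (not independently verified).
DEEP_WEB_TB = 7_500                 # terabytes in the deep web
SURFACE_WEB_TB = 19                 # terabytes in the surface web
DEEP_WEB_DOCS = 550_000_000_000     # "nearly 550 billion" documents
SURFACE_WEB_DOCS = 1_000_000_000    # 1 billion documents

# How many times larger is the deep web, by volume and by document count?
size_ratio = DEEP_WEB_TB / SURFACE_WEB_TB
doc_ratio = DEEP_WEB_DOCS / SURFACE_WEB_DOCS

# Share of all documents (deep + surface) that sit on the surface web.
surface_share = SURFACE_WEB_DOCS / (DEEP_WEB_DOCS + SURFACE_WEB_DOCS)

print(f"Deep web is about {size_ratio:.0f} times larger by volume")
print(f"Deep web has about {doc_ratio:.0f} times as many documents")
print(f"Surface web holds about {surface_share:.2%} of all documents")
```

Either way you slice it, the surface web is a tiny fraction of what is out there.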

When to use the DEEP Web (and when to use other tools):
* Directories and Portals when you:
- have a broad topic
- want selected, evaluated, and annotated collections
- prefer quality over quantity
* Invisible or Deep Web (search sites and databases) when you:
- are looking for information that is likely in a database
- are looking for information that dynamically changes in content
* Search engines (general and specialized) when you:
- have a narrow topic
- want to take advantage of the newer retrieval technologies

Here are some "hidden" databases that you, as educators or future educators, might find useful when teaching:

* Educator's Reference Desk
* NatureServe Explorer
* FindArticles
* CompletePlanet

3 comments:

DRS said...

This is actually a very important concept that many don't know about or utilize. I had a discussion with Joyce Valenza, who works outside of Philadelphia as a teacher librarian, and although she called it the invisible web, she was referring to the same thing. Her concern is that people don't use it because it's largely not indexed by Google. Because many of these databases are owned by companies and only released to the public via educational institutions who pay big dollars for them, many resources are unused and perhaps unusable. Add to that the fact that even if they were accessible to Google, historical, primary documents might not be searchable.
For example, if students went to search for newspaper articles from the 1940s and looked up Holocaust, they would find nothing. Why? Because it wasn't called the Holocaust in 1940. Same with terms like "African Americans", "World War II", etc. The trouble is how we will index or tag these resources. The job will be great.
Joyce told me about something called Federated Search. It may resolve some of the issues.
http://en.wikipedia.org/wiki/Federated_search

This is certainly a topic I know little about but need to learn more about. Thanks for bringing it forward.

Anne Davis said...

Hi Stephanie,

This is a timely topic and I enjoyed your post. Thanks for providing the links to the hidden databases.

I also enjoyed your PhotoStory. You are so right about deciding on the topic. I have the same problem when I am naming my class blogs for student use. It takes forever for me to land on just the right one!

Blogging is a great way to learn and make connections. Keep up the good work!

Best,
Anne
http://anne.teachesme.com

Jane said...

Hey Stephanie, thanks for the info...something to definitely learn more about!