FYI@Librarian: What Is the Deep Web?

The Deep Web describes the hidden parts of the internet that can’t be found using search engines such as Google. There’s a vast amount of data on the Deep Web, including some perfectly everyday stuff, such as the contents of your own webmail inbox. The Deep Web is subject to a lot of misconceptions, though, and is frequently confused with the Dark Web and its associated illegal activities.

By one 2001 estimate, the Deep Web holds around 500 times more data than the standard “Surface” Web. Unlike the latter, this data isn’t indexed by search engines, so it can’t be found by typing a few phrases into Google.

Don’t worry, it’s not as scary as it sounds. We explain everything you need to know about the Deep Web: What it is, how it works, how it’s different to the Dark Web, and what everyone gets wrong about it.

The Deep Web Explained

You can’t get to every single webpage on the internet through a search engine. No, not even with Google. A search engine works by tracking links: It finds pages by tracking which ones have been linked to from another webpage and prioritizes the ones that have been linked to the most often.

By “crawling” from page to page, a search engine can create an index of all the pages that are linked to one another. But, if a page or the information on that page has never been linked to by another page, the search engine can’t find it. That page is part of the Deep Web.
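
The crawling idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real search engine is built: the “site” is a made-up in-memory dictionary mapping each page to the pages it links to, and the names LINKS and crawl are invented for illustration.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["post-1"],
    "post-1": [],
    "unlisted-video": [],  # exists, but no other page links to it
}

def crawl(start, links):
    """Breadth-first crawl: follow links from `start`, return every page reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("home", LINKS)
deep_web = set(LINKS) - indexed  # pages that exist but were never reached

print(sorted(indexed))   # ['about', 'blog', 'home', 'post-1']
print(sorted(deep_web))  # ['unlisted-video']
```

The unlisted page is still there for anyone who knows its address. The crawler just has no link that leads to it.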

Placing web content on the Deep Web is actually fairly easy.

You can opt out of being indexed by Google’s search engine with just a few lines of code.
You can unlist a YouTube video, so that only those with the link can find it.
You can ensure your Tumblr blog isn’t indexed by clicking a button on the settings page.
You can make your Facebook page private to keep search engines from crawling it as well.

None of this means that people can’t find the information you post. It just means they won’t be able to get to it from search engines.
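
As for the “few lines of code”: two common conventions are a robots.txt file, which asks crawlers not to fetch certain paths, and a “noindex” robots meta tag, which asks them not to list a page. The sketch below checks both using only Python’s standard library. The robots.txt contents, the HTML snippet, the “ExampleBot” name and the example.com URLs are all made up for illustration, and both mechanisms only work with crawlers that choose to honor them.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Made-up robots.txt asking all crawlers to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Made-up page carrying a "noindex" robots meta tag.
PAGE_HTML = '<html><head><meta name="robots" content="noindex"></head><body>Hi</body></html>'

def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the robots.txt text permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

class NoIndexFinder(HTMLParser):
    """Sets `noindex` to True when a robots meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True

def has_noindex(html):
    """Return True if the HTML asks search engines not to index the page."""
    finder = NoIndexFinder()
    finder.feed(html)
    return finder.noindex

print(allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/private/diary.html"))  # False
print(allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html"))          # True
print(has_noindex(PAGE_HTML))                                                                 # True
```

Neither mechanism locks anything: the page can still be opened by anyone who has the link.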

The Deep Web vs. The Dark Web

The Dark Web is effectively a subset of the Deep Web. Although the two terms are sometimes used interchangeably, this isn’t correct.

While any online information that hasn’t been indexed by search engines is part of the Deep Web, only the online information that is intentionally hidden from search engines qualifies as the Dark Web.

Those using the Dark Web are taking advantage of the secrecy in order to buy items or share information that they don’t want anyone knowing they’re buying or sharing: Weapons, drugs, and sex are all included.

Think of it this way: If you’re on the Deep Web, you’re exploring the wild west of the internet. If you move into the Dark Web, you’re choosing to walk into the bandit’s den on the outskirts of town.

Not So Hidden, After All

Since it’s designed to be hidden, the Dark Web is actually easier to identify than the disparate elements that make up the Deep Web. Dark Web users often host their sites on the “.onion” domain rather than “.com”, as this domain is only accessible through the Tor browser (Tor is short for “The Onion Router”).
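
Spotting such an address is simple, which is the point being made here. A minimal sketch, assuming all we want is to check whether a URL’s hostname ends in “.onion”; the function name and both sample URLs are made up, and the check says nothing about whether a site actually exists or what it hosts.

```python
from urllib.parse import urlparse

def is_onion_address(url):
    """Return True if the URL's hostname ends in the .onion domain."""
    host = urlparse(url).hostname or ""
    return host.lower().endswith(".onion")

print(is_onion_address("http://exampleonionaddress.onion/market"))  # True
print(is_onion_address("https://example.com/onion-soup"))           # False
```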

Those who strongly value their anonymity might use Tor even if their activities are legal, as companies can’t easily track the location or network of a Tor user. The NSA likely keeps a database on anyone who has downloaded Tor, however, so the Dark Web might not be as secure as its inhabitants hope.

Unlike the Deep Web, the Dark Web is fairly small, likely just 30 or 40 thousand sites. More criminal activity happens on the rest of the internet than on the Dark Web, simply because the regular internet has billions of users.

How the Deep Web Works

There are a few different reasons why a page or the data it holds might not be indexed by a search engine.

It is intentionally kept off the ordinary web, for example on a .onion domain that standard browsers and crawlers can’t reach.
It is password-protected — this includes subscription services, private social media accounts, and everyone’s Gmail inbox.
It is hidden behind a CAPTCHA or similar machine-input filter.
It simply hasn’t been linked to by any other online page.
The data is only accessible through a database query.

Deep Web Databases

The last possibility is why the Deep Web holds so much potential: The raw, unlinked information found in online databases is part of the Deep Web. Visitors must type a keyword into a search function to find the information they want, and the page’s address stays the same no matter which results the portal returns, so a crawler has no separate link to follow to each record.

In other words, thousands of databases on the internet — all of which are free to access — can’t be found by Google or Bing.

The Social Security Administration’s baby name database is an example of information that can be searched within the portal, but isn’t available on its own link. As a result, you can easily learn that 8,705 U.S. babies were named “William” in 1888, but Google can’t index this fact.
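
Here is a rough sketch of why data like this stays out of a search index. It is not the Social Security Administration’s actual system: the in-memory database, the table layout, the search function and the example.org portal address are all assumptions made for illustration. The only real figure is the William count quoted above.

```python
import sqlite3

# In-memory stand-in for a searchable database behind a single portal page.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE baby_names (name TEXT, year INTEGER, births INTEGER)")
db.execute(
    "INSERT INTO baby_names VALUES (?, ?, ?)",
    ("William", 1888, 8705),  # figure quoted in the article
)

# Hypothetical portal address: it is the same for every search.
PORTAL_URL = "https://example.org/babynames"

def search(name, year):
    """Answer a visitor's form query; the result has no URL of its own."""
    row = db.execute(
        "SELECT births FROM baby_names WHERE name = ? AND year = ?",
        (name, year),
    ).fetchone()
    return row[0] if row else None

# A link-following crawler only ever sees PORTAL_URL.
# A person typing into the search form gets the data.
print(PORTAL_URL, "->", search("William", 1888))
```

Because the answer only appears in response to a typed query, there is no link for a crawler to follow to it.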

Useful databases that commonly escape search engines’ notice include census data, patient lists, insurance information, and scientific papers.

Misconceptions About the Deep Web

The name “Deep Web” makes the topic sound intriguing. After all, it’s easy to fear the unknown. People assume that if a search engine can’t find information on the internet, it must be important and could be dangerous.

Misconception #1: The Deep Web is Illegal

No, you’re thinking of the Dark Web – the Deep Web is far more humdrum. In reality, the Deep Web is just the detritus of the internet – the stuff that would render Google Search unusable if it turned up in the listings.

Sure, the Deep Web includes the Dark Web, but it’s a lot larger and more boring than the Dark Web.

Misconception #2: The Deep Web is for Experts Only

Far from it. Do you have a Gmail, Hotmail or Outlook.com email address? Then you’re already using the Deep Web.

What does this mean in practice? Let’s say you want to get to your inbox. You’d start by heading to Gmail and logging in securely with your address and password. What you wouldn’t do is search “John Doe’s inbox” on Google and click a link to get there.

That’s a prime example of the Deep Web in practice. See? No computer science degree needed.

Misconception #3: The Deep Web Can’t Be Searched

The final misconception many have about the Deep Web? That it can’t be searched at all.

The Deep Web has also been called the “Invisible Web,” but this term is misleading. Pages on the Deep Web aren’t impossible to find. In short, being on the Deep Web won’t hide every trace of you online.

In some cases, there certainly are ways of searching the content that’s on the Deep Web. But it’s not the kind of activity that most people will be interested in.

How You Can Use the Deep Web

The casual internet user may never know how much they’re missing. But a vast database of information is available to anyone who understands how to venture below the surface of the internet.

In reality, almost all of the information is either too specialized for the general public to care about it, or simply private information — your email inbox, for example, isn’t meant to be available to search engines (and rightly so).

If you’d like to keep your personal information secure while operating online, you can download the Tor browser.

But really, any time you password-protect your online information or access your emails via Gmail or Outlook.com, you’re already taking advantage of the Deep Web.

Read up on the best password practices or consider learning more about VPNs to continue keeping your accounts secure or anonymous.

How to Search the Deep Web

The Deep Web holds a lot more information than the surface layer of the internet. In 2001, experts estimated it might be between 400 and 550 times larger, though it can’t be definitively measured. Since much of that information is tucked away in databases, it represents a deep bench of valuable information that is largely wasted.

Luckily, if you want to find information on the Deep Web, you have a few options.

People Search Engines

You can try looking up individuals through a people search engine — such as Pipl or MyLife — as these engines’ robots directly access and extract data from searchable databases that typical search engines can’t index.

Full-Text Search Engines

If you’re hoping to access the academic side of the Deep Web, full-text search engines such as JSTOR, the Library of Congress, or Google Scholar can search otherwise-isolated databases of books and articles.

Search One Layer Up

When in doubt, try Googling general terms describing the type of information you need. I know, I know – I just told you that the Deep Web can’t be accessed through Google. However, if the information is located in a searchable database, you just need to locate that database.

For instance, a search engine will never tell you what the exact volume and temperature of the rivers in your county were four days ago. But, if you Google “Current Water Data for the Nation,” you’ll find a regularly updated database run by the U.S. Geological Survey, and searching it will turn up the results you need.

Think of this process like rowing a boat across the surface of a lake until you reach the right spot to sink your fishing line. You can’t search the entire depths of the lake, but you still got the fish you wanted.

Source | https://tech.co/what-is-the-deep-web-2018-05

Regards

Mr. Pralhad Jadhav

Master of Library & Information Science (NET Qualified)

Senior Manager @ Knowledge Repository

Khaitan & Co

Blog | http://pralhad-fyilibrarian.blogspot.in/

Website | https://sites.google.com/site/pralhadjadhavlib/home

Twitter Handle | @Pralhad161978

Mobile @ 9665911593