CEHv6 Module 04: Google Hacking

  

Ethical Hacking and Countermeasures Version 6

Module IV: Google Hacking

Module Objective

This module will familiarize you with:

  • What is Google Hacking
  • What a Hacker Can Do With Vulnerable Site
  • Google Hacking Basics
  • Google Advanced Operators
  • Pre-Assessment
  • Locating Exploits and Finding Targets
  • Tracking Down Web Servers, Login Portals, and Network Hardware
  • Google Hacking Tools

Module Flow

  • Google Hacking
  • What a Hacker Can Do With Vulnerable Site
  • Google Hacking Basics
  • Google Advanced Operators
  • Pre-Assessment
  • Locating Exploits and Finding Targets
  • Tracking Down Web Servers, Login Portals, and Network Hardware
  • Google Hacking Tools

What is Google Hacking

  Google hacking is a term that refers to the art of creating complex search engine queries in order to filter through large amounts of search results for information related to computer security.

  In its malicious form, it can be used to detect websites that are vulnerable to numerous exploits and vulnerabilities, as well as to locate private, sensitive information about others, such as credit card numbers, social security numbers, and passwords.

  Google hacking involves using Google operators to locate specific strings of text within search results.

What a Hacker Can Do With Vulnerable Site

Information that the Google Hacking Database identifies:

  • Advisories and server vulnerabilities
  • Error messages that contain too much information
  • Files containing passwords
  • Sensitive directories
  • Pages containing logon portals
  • Pages containing network or vulnerability data such as firewall logs

  

Google Hacking Basics

Anonymity with Caches

  Hackers can get a copy of sensitive data even after the plug on that pesky Web server has been pulled, and they can crawl an entire website without sending a single packet to the server.

  If the Web server does not get so much as a packet, it cannot write anything to its log files.

Using Google as a Proxy Server

  Google sometimes works as a proxy server, which requires a Google-translated URL and some minor URL modification.

  The translation URL is generated through Google’s translation service, located at www.google.com/translate_t

  If a URL is entered into the “Translate a web page” field, then after selecting a language pair and clicking the Translate button, Google will translate the contents of the Web page and generate a translation URL.

Directory Listings

  A directory listing is a type of Web page that lists files and directories that exist on a Web server.

  It is designed to be navigated by clicking directory links. Directory listings typically have a title that describes the current directory and a list of files and directories that can be clicked.

  Just like an FTP server, directory listings offer a no-frills, easy-install solution for granting access to files that can be stored in categorized folders.

  Problems faced by directory listings are:

  • They do not prevent users from downloading certain files or accessing certain directories, hence they are not secure
  • They can display information that helps an attacker learn specific technical details about the Web server
  • They do not discriminate between files that are meant to be public and those that are meant to remain behind the scenes
  • They are often displayed accidentally, since many Web servers display a directory listing if a top-level index file is missing or invalid

  

Directory Listings (cont’d)

  Locating Directory Listings

  Since directory listings offer parent directory links and allow browsing through files and folders, an attacker can find sensitive data simply by locating listings and browsing through them.

  Locating directory listings with Google is fairly straightforward, as they begin with the phrase “Index of,” which also shows in the title.

  An obvious query to find this type of page might be intitle:index.of, which can find pages with the term “index of” in the title of the document.

  The queries intitle:index.of “parent directory” or intitle:index.of “name size” indeed provide directory listings by focusing not only on index.of in the title but also on keywords often found inside directory listings, such as parent directory, name, and size.

  Locating Directory Listings (cont’d)

  Finding Specific Directories

  This is easily accomplished by adding the name of the directory to the search query.

  To locate “admin” directories that are accessible from directory listings, queries such as intitle:index.of.admin or intitle:index.of inurl:admin will work well, as shown in the following figure

Finding Specific Files

  As the directory listing is in tree style, it is also possible to find specific files in a directory listing.

  To find WS_FTP log files, try a search such as intitle:index.of ws_ftp.log, as shown in the figure below:
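The directory listing queries described so far all follow one pattern: intitle:index.of plus one or more extra terms. A minimal Python sketch of that pattern is given here; the helper names listing_query and quoted are illustrative and not part of any tool covered in this module, and the code only assembles query strings without contacting Google.

```python
def quoted(phrase: str) -> str:
    """Wrap a multi-word phrase in double quotes so it is searched as a phrase."""
    return f'"{phrase}"'


def listing_query(*terms: str) -> str:
    """Build an intitle:index.of query followed by any extra search terms."""
    return " ".join(["intitle:index.of", *terms])


# Generic directory listings, keyed on text commonly found inside them.
generic = [
    listing_query(quoted("parent directory")),
    listing_query(quoted("name size")),
]

# Listings for a specific directory, then for a specific file.
admin_dirs = listing_query("inurl:admin")
ftp_logs = listing_query("ws_ftp.log")

if __name__ == "__main__":
    for query in generic + [admin_dirs, ftp_logs]:
        print(query)
```

Keeping the query as a plain string makes it easy to review each search before it is ever submitted by hand.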

Server Versioning

  The information an attacker can use to determine the best method for attacking a Web server is the exact software version.

  An attacker can retrieve that information by connecting directly to the Web port of that server and issuing a request for the HTTP headers.

  Some typical directory listings provide the name of the server software as well as the version number at the bottom of the page. This information can be faked, but an attacker can still use it to plan an attack on the Web server.

  The intitle:index.of “server at” query will locate all directory listings on the Web with index of in the title and server at anywhere in the text of the page.

  In addition to identifying the Web server version, it is also possible to determine the operating system of the server as well as modules and other software that is installed.

  The server versioning technique can be extended by including more details in the query.
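As an illustration of what a "server at" footer gives away, the following Python sketch pulls the product, version, platform note, and module tokens out of a footer line. It assumes the common Apache signature layout "Product/Version (Platform) modules Server at host Port N"; the function name parse_banner and the sample hosts are hypothetical, and real footers may differ or be faked.

```python
import re
from typing import Optional

# Assumed footer layout: "Apache/1.3.27 (Unix) mod_x/1.0 Server at host Port 80"
BANNER = re.compile(
    r"(?P<product>[^/\s]+)/(?P<version>\S+)"   # e.g. Apache/1.3.27
    r"(?:\s+\((?P<platform>[^)]*)\))?"         # optional "(Unix)"-style note
    r"(?P<modules>.*?)"                        # optional module tokens
    r"\s+Server at\s+(?P<host>\S+)"
    r"\s+Port\s+(?P<port>\d+)"
)


def parse_banner(text: str) -> Optional[dict]:
    """Return the fields of a 'Server at' footer, or None if none is found."""
    match = BANNER.search(text)
    if match is None:
        return None
    info = match.groupdict()
    info["modules"] = info["modules"].split()  # "" becomes an empty list
    info["port"] = int(info["port"])
    return info


if __name__ == "__main__":
    print(parse_banner("Apache/1.3.27 Server at www.example.com Port 80"))
```

The same fields are what an administrator would want to check on their own listings to see how much version detail is being exposed.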

  


Going Out on a Limb: Traversal Techniques

Attackers use traversal techniques to expand a small foothold into a larger compromise.

The query intitle:index.of inurl:“/admin/*” helps with traversal, as shown in the figure:

Directory Traversal

  Clicking on the parent directory link opens the sublinks under it. This is basic directory traversal.

  Besides walking through the directory tree, an attacker can also step outside the Google search results and wander around on the target Web server itself.

  A word in the URL can be replaced with other words.

  A poorly coded third-party software product installed on the server may accept directory names as arguments, which allows users to view files above the Web server directory.

  Automated tools can do a much better job of locating files and vulnerabilities.

Incremental Substitution

  This technique involves replacing numbers in a URL in an attempt to find directories or files that are hidden, or unlinked from other pages.

  By changing the numbers in the file names, the other files can be found.

  In some examples, substitution is used to modify the numbers in the URL to locate other files or directories that exist on the site:

  • /docs/bulletin/1.xls could be modified to /docs/bulletin/2.xls
  • /DigLib_thumbnail/spmg/hel/0001/H/ could be changed to /DigLib_thumbnail/spmg/hel/0002/H/
  • /gallery/wel008-1.jpg could be modified to /gallery/wel008-2.jpg
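Incremental substitution is mechanical enough to sketch in a few lines of Python. The function below, whose name increment_last_number is purely illustrative, bumps the last run of digits in a path and keeps any zero padding. It assumes that the last number is the one worth changing, which will not hold for every URL.

```python
import re


def increment_last_number(path: str, step: int = 1) -> str:
    """Return path with its last run of digits increased by step.

    Zero padding is preserved (0001 becomes 0002). A path without any
    digits is returned unchanged.
    """
    runs = list(re.finditer(r"\d+", path))
    if not runs:
        return path
    last = runs[-1]
    digits = last.group()
    bumped = str(int(digits) + step).zfill(len(digits))
    return path[:last.start()] + bumped + path[last.end():]


if __name__ == "__main__":
    for example in ("/DigLib_thumbnail/spmg/hel/0001/H/", "/gallery/wel008-1.jpg"):
        print(example, "->", increment_last_number(example))
```

Calling the function repeatedly on its own output walks through a numbered series one candidate at a time.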

Extension Walking

  The filetype operator can be used to locate files with specific file extensions.

  HTM files can be easily searched with a query such as filetype:HTM HTM.

  Filetype searches require a search parameter, and files ending in HTM always have HTM in the URL.

  After locating HTM files, the substitution technique is used to find files with the same file name and a different extension.

  The easiest way to determine the names of backup files on a server is to locate a directory listing using intitle:index.of or to search for specific files with queries such as intitle:index.of index.php.bak or inurl:index.php.bak.

  If a system administrator or Web authoring program creates backup files with a .BAK extension in one directory, there is a good chance that BAK files will exist in other directories as well.
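Extension walking can be sketched as two small string operations: swapping a file's extension, and appending a backup extension to form search queries. In this Python sketch the function names swap_extension and backup_queries are hypothetical; only the two query shapes come from the text above.

```python
import posixpath
from typing import List


def swap_extension(path: str, new_ext: str) -> str:
    """Same file name, different extension: /docs/index.htm -> /docs/index.bak."""
    root, _old_ext = posixpath.splitext(path)
    return root + new_ext


def backup_queries(filename: str, backup_ext: str = ".bak") -> List[str]:
    """Queries for a backup copy that keeps the original name and adds an extension."""
    candidate = filename + backup_ext
    return [f"intitle:index.of {candidate}", f"inurl:{candidate}"]


if __name__ == "__main__":
    print(swap_extension("/docs/index.htm", ".bak"))
    for query in backup_queries("index.php"):
        print(query)
```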

Google Advanced Operators

Site Operator

  The site operator is absolutely invaluable during the information-gathering phase of an assessment.

  Site search can be used to gather information about the servers and hosts that a target hosts.

  Using simple reduction techniques, you can quickly get an idea about a target’s online presence.

  Consider the simple example of site:washingtonpost.com -site:www.washingtonpost.com

  This query effectively locates pages on the washingtonpost.com domain other than www.washingtonpost.com
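The reduction technique amounts to starting from the whole domain and subtracting each host that is already known. A minimal Python sketch of that string construction is given here; the function name reduction_query is an illustrative choice, and the code only builds the query text.

```python
from typing import Iterable


def reduction_query(domain: str, known_hosts: Iterable[str] = ()) -> str:
    """Build 'site:domain -site:host1 -site:host2 ...' for the given hosts."""
    parts = [f"site:{domain}"]
    parts += [f"-site:{host}" for host in known_hosts]
    return " ".join(parts)


if __name__ == "__main__":
    print(reduction_query("washingtonpost.com", ["www.washingtonpost.com"]))
```

Each new host that turns up in the results can be appended to known_hosts and the query rebuilt, so the result set shrinks toward hosts not yet seen.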

intitle:index.of

  intitle:index.of is the universal search for directory listings.

  In most cases, this search applies only to Apache-based servers, but due to the overwhelming number of Apache-derived Web servers on the Internet, there is a good chance that the server you are profiling will be Apache-based.

error | warning

  Error messages can reveal a great deal of information about a target.

  Often overlooked, error messages can provide insight into the application or operating system software a target is running, the architecture of the network the target is on, information about users on the system, and much more.

  Not only are error messages informative, they are prolific: a query of intitle:error returns over 55 million results.

login | logon

  Login portals can reveal the software and operating system of a target, and in many cases “self-help” documentation is linked from the main page of a login portal.

  These documents are designed to assist users who run into problems during the login process.

  Whether the user has forgotten his or her password or even username, this documentation can provide clues that might help an attacker.

  Documentation linked from login portals lists e-mail addresses, phone numbers, or URLs of human assistants who can help a troubled user regain lost access.

  These assistants, or help desk operators, are perfect targets for a social engineering attack.

username | userid | employee.ID | “your username is”

  There are many different ways to obtain a username from a target system.

  Even though a username is the less important half of most authentication mechanisms, it should at least be marginally protected from outsiders.

password | passcode | “your password is”

  The word password is so common on the Internet that there are over 73 million results for this one-word query.

  During an assessment, it is very likely that results for this query combined with a site operator will include pages that provide help to users who have forgotten their passwords.

  In some cases, this query will locate pages that provide policy information about the creation of a password.

  This type of information can be used in an intelligent-guessing or even a brute-force campaign against a password field.

admin | administrator

  The word administrator is often used to describe the person in control of a network or system.

  The word administrator can also be used to locate administrative login pages, or login portals.

  The phrase Contact your system administrator is fairly common on the Web, as are several basic derivations.

  A query such as “please contact your * administrator” will return results that reference local, company, site, department, server, system, network, database, e-mail, and even tennis administrators.

  If a Web user is told to contact an administrator, chances are that the data has at least moderate importance to a security tester.

admin login

  Reveals administrative login pages

-ext:html -ext:htm -ext:shtml -ext:asp -ext:php

  The -ext:html -ext:htm -ext:shtml -ext:asp -ext:php query uses ext, a synonym for the filetype operator, and is a negative query.

  It returns no results when used alone and should be combined with a site operator to work properly.

  The idea behind this query is to exclude some of the most common Internet file types in an attempt to find files that might be more interesting.
inurl:temp | inurl:tmp | inurl:backup | inurl:bak

  The inurl:temp | inurl:tmp | inurl:backup | inurl:bak query, combined with the site operator, searches for temporary or backup files or directories on a server.

  Although there are many possible naming conventions for temporary or backup files, this search focuses on the most common terms.

  Since this search uses the inurl operator, it will also locate files that contain these terms as file extensions, such as index.html.bak
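Both the negative ext query and the inurl alternatives are meant to be scoped with a site operator. A minimal Python sketch of how such queries could be assembled is given here. The helper names any_of, none_of, and scoped are hypothetical, and example.com is a placeholder for a site you are authorized to assess.

```python
from typing import Sequence


def any_of(operator: str, terms: Sequence[str]) -> str:
    """Join terms as alternatives: inurl:temp | inurl:tmp | ..."""
    return " | ".join(f"{operator}:{term}" for term in terms)


def none_of(operator: str, terms: Sequence[str]) -> str:
    """Join terms as exclusions: -ext:html -ext:htm ..."""
    return " ".join(f"-{operator}:{term}" for term in terms)


def scoped(site: str, query: str) -> str:
    """Prefix a query with a site operator so it applies to one target only."""
    return f"site:{site} {query}"


# example.com is a placeholder domain, not a target named in this module.
backup_search = scoped("example.com", any_of("inurl", ["temp", "tmp", "backup", "bak"]))
unusual_files = scoped("example.com", none_of("ext", ["html", "htm", "shtml", "asp", "php"]))

if __name__ == "__main__":
    print(backup_search)
    print(unusual_files)
```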

  

Pre-Assessment

intranet | help.desk

  The term intranet, despite more specific technical meanings, has become a generic term that describes a network confined to a small group.

  In most cases, the term intranet describes a closed or private network unavailable to the general public.

  Many sites have configured portals that allow access to an intranet from the Internet, bringing this typically closed network one step closer to potential attackers.

Locating Exploits and Finding Targets

Locating Public Exploit Sites

  One way to locate exploit code is to focus on the file extension of the source code and then search for specific content within that code.

  Since source code is the text-based representation of the difficult-to-read machine code, Google is well suited for this task.

  For example, a large number of exploits are written in C, which generally uses source code ending in a .c extension.

  A query for filetype:c exploit returns around 5,000 results, most of which are exactly the types of programs you are looking for.

  Because these results come from the most popular sites hosting C source code containing the word exploit, the returned list is a good start for a list of bookmarks.

  Using page-scraping techniques, you can isolate these sites by running a UNIX command against the dumped Google results page:

  grep Cached exp | awk -F" -" '{print $1}' | sort -u
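For readers who are less familiar with the shell pipeline, here is a rough Python equivalent of the same three steps: keep lines containing Cached, take the text before the first " -", and return the unique values in sorted order. The function name sites_from_dump is illustrative, and the layout of the dumped results page is an assumption, since the module does not show it.

```python
from typing import Iterable, List


def sites_from_dump(lines: Iterable[str],
                    marker: str = "Cached",
                    separator: str = " -") -> List[str]:
    """Mimic: grep MARKER | awk -F" -" '{print $1}' | sort -u"""
    sites = set()
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:                            # grep step
            sites.add(line.split(separator, 1)[0])    # awk step: first field
    return sorted(sites)                              # sort -u step


if __name__ == "__main__":
    # "exp" is the dumped results file named in the shell command.
    with open("exp", encoding="utf-8", errors="replace") as dump:
        for site in sites_from_dump(dump):
            print(site)
```

Python sorts by code point, so the order may differ slightly from a locale-aware sort on some systems.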

Locating Exploits Via Common Code Strings

  Another way to locate exploit code is to focus on common strings within the source code itself.

  One way to do this is to focus on common inclusions or header file references.

  For example, many C programs include the standard input/output library functions, which are referenced by an include statement such as #include <stdio.h> within the source code.

  A query like this would locate C source code that contained the word exploit, regardless of the file’s extension:

  • “#include <stdio.h>” exploit

  Searching for Exploit Code with Nonstandard Extensions

  Locating Source Code with Common Strings

Locating Vulnerable Targets

  Attackers are increasingly using Google to locate Web-based targets vulnerable to specific exploits.

  In fact, it’s not uncommon for public vulnerability announcements to contain Google links to potentially vulnerable targets.

Locating Targets Via Demonstration Pages

  To develop a query string that locates vulnerable targets on the Web, start with the vendor’s Web site, which is a good place to discover what exactly the product’s Web pages look like.

  However, some administrators might modify the format of a vendor-supplied Web page to fit the theme of the site.

  These types of modifications can impact the effectiveness of a Google search that targets a vendor-supplied page format.

  You can find that most sites look very similar and that nearly every site has a “powered by” message at the bottom of the main page.

  “Powered by” Tags Are Common Query Fodder for Finding Web Applications

Locating Targets Via Source Code

  A hacker might use the source code of a program to discover ways to search for that software with Google.

  To find the best search string to locate potentially vulnerable targets, you can visit the Web page of the software vendor to find the source code of the offending software.

  In cases where source code is not available, an attacker might opt to simply download the offending software and run it on a machine he controls to get ideas for potential searches.

  Vulnerable Web Application Examples

Locating Targets Via CGI Scanning

  One of the oldest and most familiar techniques for locating vulnerable Web servers is through the use of a CGI scanner.

  These programs parse a list of known “bad” or vulnerable Web files and attempt to locate those files on a Web server.

  Based on various response codes, the scanner could detect the presence of these potentially vulnerable files.

  A CGI scanner can list vulnerable files and directories in a data file, such as:

  A Single CGI Scan-Style Query

  Example: search for inurl:/cgi-bin/userreg.cgi
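Turning a scanner's data file into CGI scan-style queries is a one-line transformation per entry. The Python sketch below assumes a hypothetical data file format of one path per line, with blank lines and # comments ignored; the function name cgi_paths_to_queries is illustrative and does not belong to any particular scanner.

```python
from typing import Iterable, List


def cgi_paths_to_queries(lines: Iterable[str]) -> List[str]:
    """Turn each path from a scanner data file into an inurl: query."""
    queries = []
    for raw in lines:
        path = raw.strip()
        if not path or path.startswith("#"):   # skip blanks and comments
            continue
        if not path.startswith("/"):           # normalize to a leading slash
            path = "/" + path
        queries.append(f"inurl:{path}")
    return queries


if __name__ == "__main__":
    for query in cgi_paths_to_queries(["/cgi-bin/userreg.cgi"]):
        print(query)
```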

Tracking Down Web Servers, Login Portals, and Network Hardware

Finding IIS 5.0 Servers

  Query for “Microsoft-IIS/5.0 server at”

Web Server Software Error Messages

  Error messages contain a lot of useful information, but in the context of locating specific servers, you can use portions of various error messages to locate servers running specific software versions.

  The best way to find error messages is to figure out what messages the server is capable of generating.

  You could gather these messages by examining the server source code or configuration files or by actually generating the errors on the server yourself.

  The best way to get this information from IIS is by examining the source code of the error pages themselves.

  IIS 5 and 6, by default, display static HTTP/1.1 error messages when the server encounters some sort of problem.

  These error pages are stored by default in the %SYSTEMROOT%\help\iisHelp\common directory.

Web Server Software Error Messages (cont’d)

  A query such as intitle:“The page cannot be found” “please following” “Internet * Services” can be used to search for IIS servers that present a 400 error.

  IIS HTTP/1.1 Error Page Titles

  “Object Not Found” Error Message Used to Find IIS 5.0

Apache Web Server

  Apache Web servers can also be located by focusing on server-generated error messages.

  Some generic searches that can be used are “Apache/1.3.27 Server at” -intitle:index.of intitle:inf or “Apache/1.3.27 Server at” -intitle:index.of intitle:error

  Apache 2.0 Error Pages

Application Software Error Messages

  Although this ASP message is fairly benign, some ASP error messages are much more revealing.

  Consider the query “ASP.NET_SessionId” “data source=”, which locates unique strings found in ASP.NET application state dumps.

  These dumps reveal all sorts of information about the running application and the Web server that hosts that application.

  An advanced attacker can use encrypted password data and variable information in these stack traces to subvert the security of the application and perhaps the Web server itself.

  ASP Dumps Provide Dangerous Details

  Many Errors Reveal Pathnames and Filenames

  CGI Environment Listings Reveal Lots of Information

Default Pages

  Another way to locate specific types of servers or Web software is to search for default Web pages.

  Most Web software, including the Web server software itself, ships with one or more default or test pages.

  These pages can make it easy for a site administrator to test the installation of a Web server or application.

  Google may crawl a Web server while it is in its earliest stages of installation, still displaying a set of default pages.

  In these cases there is generally a short window of time between the moment when Google crawls the site and when the intended content is actually placed on the server.

  A Typical Apache Default Web Page

  Locating Default Installations of IIS 4.0 on Windows NT 4.0/OP

Default Pages Query for Web Server

  Many different types of Web server can be located by querying for default pages as well.

  Outlook Web Access Default Portal

  Query allinurl:“exchange/logon.asp”

Searching for Passwords

  Password data, one of the “Holy Grails” during a penetration test, should be protected.

  Unfortunately, many examples of Google queries can be used to locate passwords on the Web.

  Windows Registry Entries Can Reveal Passwords

  A query like filetype:reg intext:“internet account manager” could reveal interesting keys containing password data.

  Usernames, Cleartext Passwords, and Hostnames!

  A search for password information, intext:(password | passcode | pass) intext:(username | userid | user), combines common words for passwords and user IDs into one query.
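The combined password and user ID search is two intext groups, each listing alternatives. A small Python sketch of how that query string could be assembled from word lists is given here; the function name intext_group is illustrative, and the word lists are the ones given in the text above.

```python
from typing import Sequence

PASSWORD_WORDS = ["password", "passcode", "pass"]
USER_WORDS = ["username", "userid", "user"]


def intext_group(words: Sequence[str]) -> str:
    """Build intext:(a | b | c) from a list of alternative words."""
    return "intext:(" + " | ".join(words) + ")"


# Both groups must match, so they are simply placed side by side.
query = " ".join([intext_group(PASSWORD_WORDS), intext_group(USER_WORDS)])

if __name__ == "__main__":
    print(query)
```

An administrator auditing their own site could prefix the result with a site operator to check whether any such pages have been indexed.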

Google Hacking Tools

  

Google Hacking Database (GHDB)

  The Google Hacking Database (GHDB) contains queries that identify sensitive data such as portal logon pages, logs with network security information, and so on.

  Visit http://johnny.ihackstuff.com

  


SiteDigger Tool

  SiteDigger searches Google’s cache to look for vulnerabilities, errors, configuration issues, proprietary information, and interesting security nuggets on websites.

Gooscan

  johnny.ihackstuff.com

  Gooscan is a tool that automates queries against Google search appliances, but it can also be run against Google itself, in direct violation of their Terms of Service.

  For the security professional, gooscan serves as a front end for an external server assessment and aids in the information-gathering phase of a vulnerability assessment.

  For the Web server administrator, gooscan helps discover what the Web community may already know about a site thanks to Google's search appliance.

Goolink Scanner

  It removes the cache information from your searches and only collects and displays the links.

  This is very handy for finding vulnerable sites wide open to Google and googlebots.

Goolag Scanner

  Goolag Scanner enables everyone to audit his or her own Web site via Google.

  It uses one XML-based configuration file for its settings.

Tool: Google Hacks

  code.google.com/p/googlehacks/

  Google Hacks is a compilation of carefully crafted Google searches that expose novel functionality from Google's search and map services.

  You can use it to view a timeline of your search results, view a map, search for music, search for books, and perform many other specific kinds of searches.

  You can also use this program to use Google as a proxy.

  

Google Hacks: Screenshot

  


Google Hack Honeypot

  Google Hack Honeypot is the reaction to a new type of malicious Web traffic: search engine hackers.

  It is designed to provide reconnaissance against attackers that use search engines as a hacking tool against resources.

  Google Hack Honeypot: Screenshot

Tool: Google Protocol

  Google Protocol is a little app that, when installed, registers two extra protocols similar to the http: and ftp: protocols under Windows, namely google: and lucky:

  URLs starting with ‘google:’ refer to the corresponding Google search.

  URLs starting with ‘lucky:’ refer to the top Google result.

Google Cartography

  

  Google Cartography uses the Google API to find Web pages referring to street names.

  Initial street and region criteria are combined to form a search query, which is then executed by the Google API.

  Each URL from the Google results is fetched and the content of the pages converted into text.

  The text is then processed using regular expressions designed to capture information relating to the relationship between streets.

  Google Cartography: Screenshot

Summary

  In this module, Google hacking techniques have been reviewed.

  The following Google hacking techniques have been discussed: