Great job finding this page. I hope you've learnt something new in the previous exercise.
Made with love by Lord_Idiot
Now, one basic skill in web hacking is reconnaissance. You may know this term in the context of wars or battles, where recon involves scouting or similar activities to gain information about the enemy forces or surrounding terrain.
Web reconnaissance is a similar concept. Before you are ready to launch your next big hack on Facebook or Google or any other website/web service, you need to understand which areas to look into to find useful information for performing your attack. This is the essence of web reconnaissance in the context of web hacking.
It might not seem too useful to perform reconnaissance on a website. What can we even gain from such activities?
Reconnaissance of websites/web services allows us to discover more features of the server that we aim to attack. The more features we are aware of, the more possible attack paths we can plan, and the more areas we can look at to find bugs/vulnerabilities. As an analogy, if we were aiming to rob a bank, we would first perform recon on the bank itself to figure out where the doors are, how many security guards there are, etc. Only with such information would we be able to craft a heist plan.
Often, web developers make configuration mistakes that allow certain pages to be seen by unauthorised users. For example, blogging websites may have admin pages for managing blog posts. If these were misconfigured, a normal user could perhaps view them and perform administrative actions on the website without authorisation. However, a normal user may not even know that such a page exists. Reconnaissance helps to find these secret pages that are usually not exposed to the user.
The benefits of web reconnaissance are plenty, and listing them all would make this already lengthy challenge even lengthier. If this topic interests you, you may wish to learn more from experts in this field like NahamSec!
There are various ways to perform web reconnaissance. In fact, viewing the HTML source, as you did in the previous section, is one of them. The only reason you found this secret page was that you were able to find the HTML comment left behind by a developer! You may be surprised, but such comments really do exist in the real world, and they can greatly help with recon.
The technique I will showcase in this section is a simple one: common file scanning!
There are certain files or webpages that can be found in many websites, and some of these can reveal more information to us in order to aid in our reconnaissance. Therefore, it's always worth a try to check whether the website you are testing has these pages, in order to give your recon data a little boost.
Here are a few common examples:
/robots.txt is a classic example for this recon technique. The file is meant to tell web crawlers (used by search engines to find pages) which pages they should NOT visit. This helps to reduce bot traffic to certain endpoints that may be resource intensive. For recon, the paths listed in it are a ready-made list of endpoints that the site owner did not want crawled.
Here is an example in a real-world website: google.com/robots.txt
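To make this concrete, here is a minimal sketch of how you could pull the interesting paths out of a robots.txt file once you have its contents. The function name `disallowed_paths` and the `SAMPLE` text are made up for illustration and do not come from any real website. It only looks at `Disallow` lines and ignores which crawler each rule applies to.

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the paths named in Disallow rules, in order, without duplicates."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow value means nothing is disallowed, so skip it.
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


# Hypothetical robots.txt contents, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /search   # expensive endpoint
Allow: /search/about
Disallow:
"""
```

Calling `disallowed_paths(SAMPLE)` gives back `/admin/` and `/search`, which are exactly the kind of paths worth noting down during recon.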
.git/HEAD is a file that you may find on websites that use git for version control. If the developer mistakenly allows access to the .git directory, you will be able to view this .git/HEAD page. Such issues can be very severe, because a lot of information can be exfiltrated from the .git directory, including sensitive information and source code.
Here are some real world examples: Nissan Oopsies eBay Japan
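One practical problem is that many servers answer every request with a friendly HTML error page, so getting a response back does not prove the .git directory is exposed. A rough way to tell is to check whether the body actually looks like a git HEAD file, which normally contains either a line starting with `ref: refs/` or a bare commit hash. The sketch below assumes the common 40-character SHA-1 hashes, and the function name `looks_like_git_head` is my own invention.

```python
import re

# A detached HEAD stores a bare commit hash (assuming SHA-1, 40 hex characters).
_SHA1_RE = re.compile(r"[0-9a-f]{40}")


def looks_like_git_head(body: str) -> bool:
    """Guess whether a response body is the contents of a .git/HEAD file."""
    text = body.strip()
    # The usual case: HEAD points at a branch, e.g. "ref: refs/heads/main".
    if text.startswith("ref: refs/"):
        return True
    # Otherwise accept only a lone commit hash.
    return _SHA1_RE.fullmatch(text) is not None
```

For example, `looks_like_git_head("ref: refs/heads/main\n")` is true, while an HTML "not found" page is rejected.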
A sitemap is provided by developers to help search engines crawl their sites more intelligently. This improves their chances of appearing higher up in search results, which helps bring traffic to their websites.
Here is an example in a real-world website: GovTech Singapore
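A sitemap is also handy for recon because it is basically a list of pages handed to you by the developers. Here is a small sketch that extracts every listed URL from a sitemap, assuming it is written in the standard XML format from sitemaps.org. The function name `sitemap_urls` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace used by the standard sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap, invented for this example.
SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/first-post</loc></url>
</urlset>
"""
```

Running `sitemap_urls(SAMPLE)` returns the two example.com URLs in the order they appear in the file.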
Hopefully you have some understanding of the importance of web recon, and how you can perform it at a basic level.
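If you would rather not type each path into your browser by hand, the whole technique can be sketched in a few lines of Python using only the standard library. Treat this as a rough illustration and not a finished tool: the names `COMMON_PATHS`, `candidate_urls`, `status_of` and `scan` are my own, the list holds only the examples from this section (with `/sitemap.xml` assumed as the conventional sitemap location), and treating a 200 status as "found" is a simplification.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Only the examples from this section; real wordlists are much longer.
COMMON_PATHS = ["/robots.txt", "/.git/HEAD", "/sitemap.xml"]


def candidate_urls(base_url: str, paths: list[str]) -> list[str]:
    """Join the base URL with each path, avoiding doubled slashes."""
    base = base_url.rstrip("/")
    return [base + "/" + path.lstrip("/") for path in paths]


def status_of(url: str, timeout: float = 5.0):
    """Return the HTTP status code for url, or None if the request failed."""
    request = Request(url, headers={"User-Agent": "recon-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, just not with a success status (e.g. 404).
        return error.code
    except OSError:
        # Connection problems, DNS failures, timeouts and similar.
        return None


def scan(base_url: str, paths: list[str], fetch_status=status_of) -> list[str]:
    """Return the candidate URLs that respond with status 200."""
    return [
        url
        for url in candidate_urls(base_url, paths)
        if fetch_status(url) == 200
    ]
```

You would call it as `scan("https://example.com", COMMON_PATHS)`, where the host is just a placeholder. The `fetch_status` parameter is there so you can swap in a fake function and try the logic without sending any requests.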
Now let's test your understanding of this concept! The website you are viewing has a common filepath that can give you more information for reconnaissance. It is not in the list of examples I've provided in this article, so you should do some research to find more common filepaths to search for.
To give you a hint, the path looks like this: /s_______.txt, where each underscore corresponds to a letter I've hidden. Once you find this file, you'll be led to the next and final stage. All the best!
Quick tip: You may be able to find many lists online from bug bounty hunters or web hackers, which may contain the well-known filename you are looking for.