This is the second article about the hacker attack against osCommerce-powered sites. In the first part, you can find the description of the attack along with detection and clean-up instructions. Now I want to show you what exactly hackers did and how they managed to poison Google search results.
The main goal is to demystify hackers and encourage webmasters to explore their own sites. The more you know about hackers, the better you’ll be at protecting your site against their attacks.
This post is based on the files and access logs of three compromised sites that I received from a webmaster who contacted me a couple of weeks ago.
In the logs of site-1, I found several POST requests to “/admin/file_manager.php/login.php?a=1&action=save” (on December 9, 10, 16, 17, and 18) from several different IPs. Right after those attacks I saw POST requests to newly created files called fly.php (the file used in the disclosed exploit; it executes arbitrary PHP code passed as a POST parameter) and flop.php. Apparently those files provided full access to the site (to directories with write permission). One of these attacks created a file called mm.php (it provides a simple interface for uploading files from a local computer to the server).
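If you want to look for the same requests in your own logs, here is a minimal Python sketch. It assumes the common Apache "combined" log format. The function name find_exploit_requests and the two sample log lines (with documentation-range IPs) are made up for illustration; they are not taken from the real logs.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache/Nginx "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Path seen in the POST requests described above.
EXPLOIT_PATH = "/admin/file_manager.php/login.php"


def find_exploit_requests(lines):
    """Return (ip, time, url) for POSTs to the file manager with action=save."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("method") != "POST":
            continue
        parts = urlsplit(m.group("url"))
        query = parse_qs(parts.query)
        if parts.path == EXPLOIT_PATH and query.get("action") == ["save"]:
            hits.append((m.group("ip"), m.group("time"), m.group("url")))
    return hits


# Invented sample lines, only to show the expected input shape.
sample = [
    '192.0.2.10 - - [09/Dec/2009:10:15:32 +0000] '
    '"POST /admin/file_manager.php/login.php?a=1&action=save HTTP/1.1" 200 512',
    '192.0.2.11 - - [09/Dec/2009:10:16:01 +0000] "GET /index.php HTTP/1.1" 200 1024',
]

for ip, when, url in find_exploit_requests(sample):
    print(ip, when, url)
```

Any hit is a reason to check what files appeared on the site shortly afterwards.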
December 21, 2009
12:13 – Hacker with IP 18.104.22.168 uses mm.php to upload an sh1.php file to the /images directory. sh1.php is a web shell. It equips hackers with a sophisticated graphical interface that provides almost full access to compromised sites. It allows hackers to browse directories, create and modify files, execute arbitrary PHP code, work with databases etc.
12:14 – 12:28 – the hacker uses the web shell to explore the internals of site-1.
12:14 – he discovers site-2 under the same account
12:23 – he discovers site-3 under the same account and decides to start the malware campaign there. He uploads sh1.php and bety.php there.
21:05 – The hacker submits three site-3/bety.php?q=keywords pages to page2rss.com (Page2Rss helps monitor websites that do not publish feeds).
21:06 – The hacker clicks on the created links on page2rss.com and visits site-3 to check that everything works as intended.
Links on page2rss.com are “nofollowed” but maybe this service somehow pings Google about new feeds, which makes the discovery faster?
21:31 – Googlebot comes to site-3 directly to bety.php pages and starts to index them. Apparently the hackers somehow submitted a big batch of bety.php URLs to Google, since Googlebot clearly didn’t rely on site-wide discovery (it didn’t follow links found in the bety pages it had just indexed).
22:50 – Googlebot finishes indexing bety.php pages. 1976 malicious pages have been indexed.
The indexed pages become available in search results immediately. The first visitor from Google Search comes at 21:47. That is just 16 minutes after Google first discovered the bety pages and started indexing them, and 5 minutes after the visited page had been indexed. At that time the initial indexing was still underway, with more than an hour to go. Ten web surfers had visited the bety pages by the time Googlebot left the site.
Some stats on visits from Google:
42 visits on December 21.
129 visits by December 31.
But wait, it’s just the beginning.
08:18 (December 22) – The hacker with IP 22.214.171.124 returns to site-1 and works with it for about 6(!) hours using the sh1.php web shell. This time he wants to start the “bety” campaigns on site-1 and site-2.
08:22 – he uploads sh1.php and bety.php to site-2.
08:51 – the hacker has someone open the site-2/bety.php?q=so-you-think-you-can-dance-phone-number page using Microsoft Translator service.
09:32 – Googlebot comes to site-2 and starts to index the bety.php pages.
09:58 – the first visitor clicks on a bety search result. As you can see, the indexed pages become searchable almost immediately.
10:36 – The first batch of 1592 bety.php pages is indexed. By this time 25 more visitors had come to site-2 bety pages via Google search results.
18:38 – One of the bety links somehow makes it to Twitter. The same minute Googlebot follows this link.
21:39 – Googlebot visits site-2 again and starts to index another batch of 5150 bety pages. This session lasts till 03:24 of the next day (almost 6 hours).
Then Googlebot regularly visits site-2, and by the end of the month it has indexed 8415 bety pages. As a result, there were 1353 visits to malicious bety pages from Google search results on December 22, 1878 visits on December 23, and 5734 visits by the end of December.
When Google picked up bety pages on site-2, the attacker switched back to site-1 and triggered the bety campaign there.
December 22, 2009
10:53 – a spammy comment with 438 links to site-1/bety.php?q=keywords pages is published on my.mail.ru.
11:02 – someone clicks on those links and opens a couple of bety pages.
12:35 – Googlebot comes to site-1 and starts to index the bety.php pages. It indexes 4887 malicious pages by 16:44.
12:59 – the first visitor from Google search.
466 – visits from Google search results on December 22.
1500 – visits from Google on December 23.
3136 – visits from Google by the end of December.
During the last 10 days of December 2009, this hacker managed to drive 9019 visits from Google to malicious bety pages. (Google was the only source of traffic for those pages.) The script that redirects visitors to malicious sites was loaded 7768 times by web surfers from 4781 unique IPs. Quite impressive, given it only took a few hours of the hacker’s time.
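Numbers like these can be pulled from an access log with a few lines of code. The sketch below is only an illustration of the idea, not the exact procedure used for this post. It assumes the Apache "combined" log format, treats any referrer whose host looks like a Google domain as a search visit (a crude heuristic), and uses invented sample lines. The names google_visits_per_day and is_google_referer are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: "combined" log format, which includes the referrer field.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"GET (?P<url>\S+)[^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)
# Crude heuristic: host is google.<tld> or ends with .google.<tld>.
GOOGLE_HOST = re.compile(r'(^|\.)google\.[a-z.]+$')


def is_google_referer(referer):
    """True if the referrer URL points to a Google-looking host."""
    host = urlsplit(referer).hostname or ""
    return bool(GOOGLE_HOST.search(host))


def google_visits_per_day(lines, script="/bety.php"):
    """Count GET requests to script?q=... that arrived from Google, per day."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        if m.group("url").startswith(script + "?q=") and is_google_referer(m.group("referer")):
            counts[m.group("day")] += 1
    return counts


# Invented sample lines: two search visits and one crawler request without a referrer.
sample = [
    '198.51.100.7 - - [21/Dec/2009:21:47:03 +0000] "GET /bety.php?q=2010-nfl-mock-draft HTTP/1.1" '
    '200 4096 "http://www.google.com/search?q=2010+nfl+mock+draft" "Mozilla/5.0"',
    '203.0.113.5 - - [21/Dec/2009:21:50:00 +0000] "GET /bety.php?q=2010-nfl-mock-draft HTTP/1.1" '
    '200 4096 "-" "Googlebot/2.1"',
    '198.51.100.9 - - [22/Dec/2009:09:58:11 +0000] "GET /bety.php?q=some-keywords HTTP/1.1" '
    '200 4096 "http://www.google.co.uk/search?q=some+keywords" "Mozilla/5.0"',
]

for day, visits in sorted(google_visits_per_day(sample).items()):
    print(day, visits)
```

Counting unique visitor IPs works the same way: collect the first field of each matching line into a set instead of a counter.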
OK. So what does this bety.php do, and how does it manage to provide Google with so many different page variants that Google considers worth showing on the first pages of search results?
Bety.php handles two types of requests: q and red.
bety.php?red=keywords requests are used to retrieve the content of lname.php, which is a redirect script, like this:
window.location = "hxxp://basicallyantispyware .net/hitin .php?land=20&affid=33220";
Every 20 minutes, bety.php updates the content of the lname.php file pulling the domain name of the currently active malicious site from
To hide the malicious redirect from search engines, the red request handler checks visitors’ IP addresses and returns nothing if it detects a request from a known IP range used by search engine crawlers.
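This kind of IP-based cloaking is simple to express in code. The Python sketch below only illustrates the logic described above; the real bety.php is a PHP script and its code is not reproduced here. The ranges are placeholders from the reserved documentation address space, not real crawler ranges, and the names is_known_crawler and handle_red are hypothetical.

```python
import ipaddress

# Placeholder ranges (reserved documentation addresses), NOT real crawler ranges.
CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_known_crawler(ip):
    """True if the address falls into one of the listed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CRAWLER_RANGES)


def handle_red(ip, redirect_script):
    """Return nothing to crawlers and the redirect script to everyone else."""
    return "" if is_known_crawler(ip) else redirect_script


print(repr(handle_red("192.0.2.5", "window.location = '...';")))    # treated as a crawler
print(repr(handle_red("203.0.113.9", "window.location = '...';")))  # treated as a visitor
```

Because the decision depends on the IP address, a crawler never receives the redirect, while ordinary visitors do.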
q requests return web pages specially crafted for Google.
When bety.php is opened for the first time, it creates a special directory called .cache (.pages in the newer version). This is where the bety script stores generated web pages.
When processing bety.php?q=keywords requests, the script checks whether there is a page called keywords.html in the cache directory. If there is, that page is displayed. E.g. for a /bety.php?q=2010-nfl-mock-draft request it checks for the file .cache/2010-nfl-mock-draft.html.
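The lookup can be sketched in a few lines of Python. This is an illustration under assumptions, not the script’s actual PHP code: it assumes the q value is used verbatim as the file name, as in the example above, and the function names are hypothetical. The last helper is the part a webmaster may find useful: listing the cached files shows which keywords the script has already generated pages for.

```python
from pathlib import Path


def cached_page_path(base_dir, keywords, cache_dir=".cache"):
    """Map a q value to the cache file it would be stored in."""
    return Path(base_dir) / cache_dir / f"{keywords}.html"


def lookup_cached_page(base_dir, keywords, cache_dir=".cache"):
    """Return the cached page text, or None if no cached file exists yet."""
    path = cached_page_path(base_dir, keywords, cache_dir)
    return path.read_text(encoding="utf-8") if path.is_file() else None


def list_cached_keywords(base_dir, cache_dir=".cache"):
    """List keyword strings that already have a cached page (for clean-up checks)."""
    directory = Path(base_dir) / cache_dir
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.html"))


print(cached_page_path("/var/www/site", "2010-nfl-mock-draft").as_posix())
```

For the newer version of the script, pass cache_dir=".pages" instead of the default.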
If the cached file is missing (and initially there are no cached files at all) it is generated on the fly. Here is the structure of the generated files:
Pretty straightforward, isn’t it? Those pages contain many relevant keywords, and while they are fresh (the first couple of days) Google temporarily boosts their ranking. For multi-keyword searches this is enough to make it to the first page of results.
What is not clear to me is
What do you think?
Hackers are always on the lookout for vulnerable websites that they can use for their malicious activities. As a site owner or webmaster, you should be ready to deal with hacker attacks.