If you use Domino’s HTTP GZIP compression, you may have issues with your Google Search Appliance reporting:
Malformed HTTP header: empty content
It appears that the Google box will tell Domino that it accepts compressed content in the HTTP headers even though it is unable to process the content. Luckily I found an easy solution on the Google Search Appliance support forum.
The solution is to add a header to the Google box’s HTTP requests that tells Domino not to send compressed content.
- Open the GSA admin interface.
- Select ‘Crawl and Index’ -> ‘HTTP Headers’
- In the ‘Additional HTTP Headers for Crawler’ enter:
Accept-Encoding: *;q=0
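To see why that header switches compression off, here is a minimal Python sketch of how a server might read an Accept-Encoding value. It is only an illustration of the quality-value rules, not Domino's actual logic; the function names `parse_accept_encoding` and `accepts_coding` are made up for this example.

```python
def parse_accept_encoding(header):
    """Return a dict mapping each listed content coding to its q value."""
    codings = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a coding listed without a q parameter defaults to q=1
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat an unparsable q as "not acceptable"
        codings[name.strip().lower()] = q
    return codings


def accepts_coding(header, coding):
    """Say whether a non-identity coding such as gzip is acceptable."""
    codings = parse_accept_encoding(header)
    coding = coding.lower()
    if coding in codings:
        return codings[coding] > 0
    if "*" in codings:
        # The wildcard covers every coding not listed explicitly.
        return codings["*"] > 0
    return False


print(accepts_coding("gzip, deflate", "gzip"))
print(accepts_coding("*;q=0", "gzip"))
```

With `*;q=0` no coding is listed explicitly, so the wildcard's q value of 0 applies to gzip and the server should fall back to sending the content uncompressed.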
Posted in Domino, Google Search Appliance
We use a Google Search Appliance to index our Domino websites. We use Google’s OneBox module to execute FT searches to return results from a database we don’t want Google to index (basically 80,000 journal records we don’t want to be counted against our license total).
Yesterday we upgraded our Search Appliance to the latest version of Google’s software (6.2.0.G14). About an hour after switching it on, one of our Domino servers was hit by a sustained denial of service. The agent called by the Google OneBox module had saturated all our HTTP threads. The Google box was making continuous requests to the search agent at a rate of nearly 200,000 per hour (roughly 55 per second). The odd thing was that Google’s OneBox module passes the end user’s IP address as part of the search query, yet all the requests were coming from 216.239.43.1, a Google IP address.
It appears that there’s a known issue with the OneBox module that can cause this (bug report #2368523). Google immediately applied the patch, and after an hour the requests had stopped.
If you intend to upgrade to 6.2, I suggest you remove all OneBox modules from your front ends before upgrading.
Related to this problem, I find it easy to create a DoS on a Domino server. Continuously calling any agent that takes a second to return results is enough (it might simply be a case of holding F5). Our servers are set up as per Lotus’s recommendations. But does anyone have any tips for optimising the Domino HTTP stack (Solaris) to avoid a DoS?
Posted in Domino
A word of warning if you use a Google Search Appliance to index your Domino content. The latest version of the appliance software (5.2.0.G32) has a bug that makes it case sensitive when checking Domino rewrite and ignore rules. For example, it will only exclude agents if the URL command is capitalised exactly as in the rule, i.e. OpenAgent. If you have agents that might cause you some problems (say, agents that delete content or send emails), make sure you add ignore statements to the exception lists before starting the index. Luckily our Google box runs with ‘student’ access, so it couldn’t do any damage!
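If you want to check your own URLs for agent calls the appliance might miss, a small Python sketch like the one below shows the difference between a case-sensitive and a case-insensitive match. This is not GSA pattern syntax, just an illustration; the example URLs and the function names are made up, and it assumes agents are invoked with a `?OpenAgent` (or `!OpenAgent`) URL command.

```python
import re

# Matches the OpenAgent URL command in any capitalisation.
# Assumption: the command follows '?' or '!' in the URL.
AGENT_CALL = re.compile(r"[?!]OpenAgent\b", re.IGNORECASE)


def is_agent_url(url):
    """Case-insensitive check: catches ?OpenAgent, ?openagent, ?OPENAGENT."""
    return bool(AGENT_CALL.search(url))


def is_agent_url_case_sensitive(url):
    """Mimics the buggy behaviour: only the exact capitalisation matches."""
    return "?OpenAgent" in url


urls = [
    "http://www.example.com/db.nsf/cleanup?OpenAgent",
    "http://www.example.com/db.nsf/cleanup?openagent",
    "http://www.example.com/db.nsf/page?OpenDocument",
]
for url in urls:
    print(url, is_agent_url(url), is_agent_url_case_sensitive(url))
```

The second URL is the dangerous one: Domino will happily run the agent, but a case-sensitive ignore rule lets the crawler through.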
Posted in Domino