admin (Admin), 4:36 pm February 10, 2009
Operations and automation tasks are sometimes an afterthought at the end of development. For Solr development, it’s actually not that bad to think about automation at the very end, because Solr provides a set of very useful scripts that make automation easy. If you are short on time to build automation, consider yourself lucky. I will first describe the basic architecture of a Solr deployment and then dive into Solr’s distribution and operations scripts.
The most basic architecture for a Solr-based application requires only a single application server. Assuming you develop in Java, you can have both Solr and your webapp served by the same application server. A more common and effective architecture involves a dedicated indexing server (or indexer) and one or more slave index servers. The idea is to separate all index-building work from normal queries. Conceptually, this is similar to database clustering, where a read/write server acts as master and read-only servers act as slaves.
The following setup involves Tomcat, Apache and Linux, and assumes Solr’s home is /solr on every Solr server.
Note: you may be able to replicate a similar configuration on Windows running Cygwin. I haven’t tried it on Windows yet, so YMMV.
The distribution scripts read their settings from conf/scripts.conf under Solr’s home (adjust the values to your own hosts):
user=solr
solr_hostname=slave1
solr_port=8080
rsyncd_port=18080
data_dir=data
webapp_name=solr
master_host=indexer
master_data_dir=/solr/data
master_status_dir=/solr/logs
SSH set up
- Solr’s index distribution scripts use SSH and rsync, so make sure SSH keys are configured and public keys are exchanged between the indexer and the slave index servers. If you haven’t configured an SSH key yet, use the ssh-keygen command to generate a public/private key pair on every Solr server.
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/solr/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/solr/.ssh/id_rsa.
Your public key has been saved in /home/solr/.ssh/id_rsa.pub.
The key fingerprint is:
0c:27:27:f5:81:36:87:82:0f:4f:39:b5:aa:fd:e4:2f solr@solr
Exchange public key between indexer and slave index servers
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 644 ~/.ssh/authorized_keys
$ ssh solr@indexer "cat .ssh/id_rsa.pub" >> ~/.ssh/authorized_keys
Rsyncd set up
- Solr uses rsync for index distribution, so make sure rsync works on your operating system. Start rsyncd for the first time with the following commands:
$ /solr/bin/rsyncd-enable
$ /solr/bin/rsyncd-start
Configure Solr to automatically generate a snapshot after each optimize. Update solr/conf/solrconfig.xml with the following:
<listener event="postOptimize" class="solr.RunExecutableListener">
<str name="exe">/solr/bin/snapshooter</str>
<str name="dir">/solr/bin/</str>
<bool name="wait">true</bool>
</listener>
Enable snapshot pulling on slave servers:
$ /solr/bin/snappuller-enable
Set up snapshot pulling on slave servers at 3am in cron:
0 3 * * * /solr/bin/snappuller; /solr/bin/snapinstaller; /solr/bin/snapcleaner -N 3
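
One thing to watch in the cron entry above: commands separated by `;` run whether or not the previous one succeeded, so snapinstaller still runs after a failed pull. Below is a minimal sketch of a wrapper that stops at the first failure. The function name `run_pull_cycle` and the `SOLR_BIN` variable are made up for illustration; only the three script names and `-N 3` come from the cron entry.

```shell
#!/bin/sh
# Hypothetical wrapper around the three distribution scripts from the cron entry.
# SOLR_BIN is an assumed variable; it defaults to /solr/bin as used in this post.
SOLR_BIN=${SOLR_BIN:-/solr/bin}

run_pull_cycle() {
    # Stop at the first step that fails so a broken pull is never installed.
    "$SOLR_BIN/snappuller" || { echo "snappuller failed" >&2; return 1; }
    "$SOLR_BIN/snapinstaller" || { echo "snapinstaller failed" >&2; return 1; }
    # Keep only the 3 most recent snapshots, as in the cron entry.
    "$SOLR_BIN/snapcleaner" -N 3
}
```

Calling `run_pull_cycle` at the end of a small script and pointing cron at that script should behave like the original entry, minus the install step after a failed pull. Chaining the three commands with `&&` directly in the crontab is a shorter way to get much the same effect.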
OPTIONAL: to load-balance your slave index servers (running Tomcat) with Apache, update httpd.conf (/etc/httpd/conf/httpd.conf on Red Hat-style systems) with the following:
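LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
# mod_proxy_http is needed for the http:// balancer members below
LoadModule proxy_http_module modules/mod_proxy_http.so
....
<VirtualHost *:80>
ProxyRequests Off
ProxyPreserveHost On
ProxyPass / balancer://tomcats/ stickysession=JSESSIONID lbmethod=byrequests
ProxyPassReverse / balancer://tomcats/
<Proxy balancer://tomcats>
# Port 8080 is Tomcat's HTTP connector (solr_port above), so use http://.
# To use AJP instead, load mod_proxy_ajp and point at the AJP connector (8009 by default).
BalancerMember http://slave1:8080 route=jvm1 loadfactor=20
BalancerMember http://slave2:8080 route=jvm2 loadfactor=20
</Proxy>
</VirtualHost>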
All indexing work should be done on your indexer. When you issue the optimize command, Solr automatically generates a snapshot. The snapshot should be generated well ahead of the scheduled snapshot pulling time (3am in this case). Apache load balancing is unnecessary if you have only one slave server or already use another load-balancing solution.
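
To confirm that a snapshot really exists on the indexer before the scheduled pull, something like the sketch below may help. It assumes that snapshot directories are named snapshot.<yyyymmddHHMMSS>, so sorting by name also sorts by time. The function name `latest_snapshot` is invented for this example and is not one of Solr's scripts.

```shell
#!/bin/sh
# Print the newest snapshot directory under a Solr data directory.
# Assumes snapshots are named snapshot.<yyyymmddHHMMSS>, so that
# sorting by name also sorts by time.
latest_snapshot() {
    data_dir=$1
    newest=""
    for dir in "$data_dir"/snapshot.*; do
        # Skip the unexpanded pattern when no snapshot exists.
        [ -d "$dir" ] || continue
        newest=$dir   # globs expand in sorted order, so the last match wins
    done
    # Returns non-zero and prints nothing when no snapshot was found.
    [ -n "$newest" ] && basename "$newest"
}
```

Running `latest_snapshot /solr/data` on the indexer shortly before 3am would show whether the optimize produced a fresh snapshot in time.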
Reference links:
http://wiki.apache.org/solr/CollectionDistribution
http://wiki.apache.org/solr/SolrOperationsTools

sturlese (Member), 7:23 pm March 9, 2009
Hello, after reading your posts I think maybe you could help me.
I am running Solr 1.3 and Tomcat 5.5 and I am having a huge problem with collection distribution.
I always take snapshots of an optimized index (so, as my index changes fast, the snapshots are always more or less the same size, and every set of hard links amounts to the full index size).
I tell snapcleaner to keep just 2 snapshots of my index. Snapcleaner deletes the rest, and if I run “du -m solr_data_folder” the deleted snapshots are gone and the folder is the size of just 2 of them.
The crazy thing is that if I run “df -m” on the partition where I keep the data, it shows a lot of disk space in use (as if no snapshot had ever been deleted). The more snapshots I take, the more the partition fills up (again, the snapshots are deleted, but “df” still shows the space as used). As I run the whole process from cron jobs, I end up with a “no space left on device” error.
If I restart Tomcat and run “df -m”, the space is available again. It's really weird!
It seems that even though the snapshots are being deleted, Tomcat keeps the hard links pointing I don't know where.
As you were writing about Solr replication, I thought you might know what is going on.
Sorry for my English, it's not my mother tongue; I am Spanish.
I would appreciate any advice you could give me.
Thanks in advance.
Marc
http://www.marcsturlese.com

admin (Admin), 12:16 am March 10, 2009
Are you running in a Windows environment? Windows tends to hold locks on files a process has accessed until the process explicitly releases them or exits. If so, I think a Tomcat restart is inevitable.
This would be very weird behavior in a Linux environment. I don't think it can happen on Linux. Please provide more information about your setup. Hopefully I can help you solve the issue.

sturlese (Member), 5:53 am March 10, 2009
Hey Ed, first of all thanks for answering me.
I am running on Debian, with Tomcat 5.5 and Java 1.6. I am using Solr CollectionDistribution, not SolrReplication.
As I said, I take snapshots of the optimized index (and always keep just 2 snapshots). So every set of hard links has the size of the whole index (because almost all files change after every optimization).
The “du -m” command shows that the Solr “data” folder has the correct size (the size of 2 snapshots), but “df -m” behaves as if no snapshot had been freed.
I have thought that maybe, even though I am deleting snapshots by deleting the hard links, Tomcat has something like a pointer to the i-nodes of the snapshots that are supposed to be deleted (and because of this “Tomcat pointer” they are never really deleted).
Once I restart Tomcat these “pointers” are removed and consequently the disk space is freed.
If that is the case… I have no clue how to sort it out. Does that make sense to you?
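
The hypothesis above can be reproduced outside Solr. The following is only a sketch with a throwaway temp file, not anything taken from the Solr scripts: it removes a file's name while a descriptor is still open on it and shows that the data is still there.

```shell
#!/bin/sh
# Demonstrates that removing a file's name does not free its data
# while some process still holds it open.
tmpfile=$(mktemp)
printf 'index data' > "$tmpfile"

exec 3< "$tmpfile"      # hold the file open on descriptor 3
rm "$tmpfile"           # remove the only directory entry (hard link)

if [ ! -e "$tmpfile" ]; then
    echo "name is gone, so du no longer counts it"
fi

# The data is still reachable through the open descriptor, which is
# why df keeps reporting the blocks as used.
content=$(cat <&3)
echo "still readable: $content"

exec 3<&-               # closing the descriptor finally releases the space
```

In this picture Tomcat plays the role of descriptor 3: until the process closes the file or exits, the blocks stay allocated even though no directory entry points to them.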
Thanks in advance Ed.
Marc.

admin (Admin), 9:41 am March 10, 2009
Post edited 1:43 pm – March 10, 2009 by admin
This is in fact very odd behavior, which I haven't seen before. There is a slim possibility that your optimize command hung and your JVM is still holding file handles on your old index files. Try this command to see whether the open index files in Tomcat match the files in your current index directory.
lsof -p <Tomcat process id>
We are getting somewhere if Tomcat is in fact holding too many open files.
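
If lsof is not installed, roughly the same check can be done on Linux through /proc. This is a sketch under that assumption (Linux only, and you need permission to read the process's fd directory, so run it as the Tomcat user or root). The function name `count_deleted_fds` is made up for this example.

```shell
#!/bin/sh
# Count file descriptors of a process that point at deleted files.
# Linux-specific: relies on /proc/<pid>/fd, where the link target of an
# unlinked file ends in " (deleted)".
count_deleted_fds() {
    pid=$1
    count=0
    for fd in /proc/"$pid"/fd/*; do
        # Descriptors can close while we iterate, so ignore failed reads.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *" (deleted)") count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}
```

A count that keeps growing after every optimize would point at handles on old index files that are never closed.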
Btw, I don't use SolrReplication either. I'm more of a script person and more comfortable using cron to run my scripts and handle the logistics.
-Ed

sturlese (Member), 10:10 am March 10, 2009
Post edited 3:18 pm – March 10, 2009 by sturlese
Hey Ed,
I think I have a clue:
http://stackoverflow.com/quest…..-diskspace
lsof shows me that Tomcat still has all the old snapshots open. I think I could free the disk space by modifying the snapcleaner script. I would do
cp /dev/null snapshot_folder
just before the rm -rf snapshot_folder.
That way I think I would free the disk space, but Tomcat would still be holding too many files (of 0 bytes), so sooner or later I think I would end up with a “too many open files” error.
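
A rough sketch of that idea follows, with two caveats. Copying /dev/null onto a directory does not empty the files inside it, so the truncation has to happen file by file. And because snapshots are hard links, truncating a file that is still linked from the live index or a newer snapshot would empty it there too, so this version only touches files whose link count is 1. The function name `truncate_and_remove` is hypothetical and is not part of Solr's snapcleaner.

```shell
#!/bin/sh
# Hypothetical helper, not part of Solr's snapcleaner: empty the files of an
# old snapshot before removing it, so space is released even if a process
# still holds them open.
truncate_and_remove() {
    snapshot_dir=$1
    # -links 1 selects files with no other hard link. Files that are still
    # linked from the live index or a newer snapshot share their data with
    # those links, so truncating them would destroy the copy still in use.
    find "$snapshot_dir" -type f -links 1 -exec sh -c ': > "$1"' sh {} \;
    rm -rf "$snapshot_dir"
}
```

This only works around the disk usage. The descriptors themselves stay open in Tomcat, so the “too many open files” concern still applies.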
Do you think it is a bug in the Solr replication scripts? Maybe a bad way of closing the IndexWriter? Or… is there any way to free deleted files held by Tomcat?
This problem is really driving me mad…
Thanks in advance

sturlese (Member), 4:41 am March 11, 2009
Any clue?
Still stuck on this… and on the Solr mailing lists no one knows about this problem…
Thanks in advance.

admin (Admin), 2:49 pm March 11, 2009
So your script change didn't help either? One thing you can try is configuring postOptimize in solrconfig.xml to trigger snapshooter (if you haven't done so already).
Lucene will definitely hold on to the file handles while an IndexReader is still open. Are you running multiple cores or slaves on the same box that are still reading the old snapshots? After an optimize command, a new set of index files should be produced, and any file handles to the old snapshots should have been closed.

sturlese (Member), 6:49 pm March 11, 2009
After lots of tracing, the problem is scaring me even more… I have realized that if I do an import with DataImportHandler followed by an index optimization, and then do another import with optimize as well, lsof shows me that Tomcat is still holding the deleted files from the old index.
I think the problem could be in the commit method of the class DataImportHandler2.java, but I can't pin it down… (maybe because it's not there, but everything seems to point there…). It looks like an IndexWriter or an IndexSearcher is not properly closed.
Another thing that comes to mind is that I am using lucene2.9-dev; I will have to try with the new official release, 2.4.1.
The last test will be to check whether this happens with Jetty too or just with Tomcat.
If any suggestion or idea comes to mind, please let me know…

admin (Admin), 11:11 am March 12, 2009
It does sound like a bug in the development version. Can you try the stable release and let me know if you still experience the same issue? I'm currently using Solr 1.3 and Lucene 2.4.

sturlese (Member), 1:13 pm March 17, 2009
It's definitely a bug in the development version. I need to keep using it because of the new facet algorithm and other new stuff, so I have reported the bug (and worked around it for my use case, but haven't found the proper solution). In case you are interested in following the issue:
https://issues.apache.org/jira/browse/SOLR-1070
Thank you very much.
Marc
http://www.marcsturlese.com

admin (Admin), 2:02 pm March 17, 2009
That's a good find, Marc. I'm sure the Solr dev team appreciates it. I think your workaround of forcing the IndexSearcher to close is perfectly fine if you are using a dedicated server for indexing. For me, I always set up a separate indexing server so I can mess around with it without any adverse effect on the query servers.