Fixing common issues when reindexing data on Solr
I recently set up a legacy project that runs on Rails 4 and Solr 5 and uses the sunspot_rails gem to talk to the Solr server. Because the project is old and depends on deprecated libraries, I decided to run it locally with Docker. Everything seemed fine until I started reindexing the data. I searched the internet for solutions, but everyone seemed to be hitting different issues and no one had the same ones as me. It took me several hours to sort it out.
Here are the issues I ran into and how I resolved them. I hope this saves someone a day.
1. Running reindex raises an exception with no specific error message, and nothing is logged on Solr
This is most likely because the endpoint of your Solr server is not configured correctly. Here is the original sunspot.yml from my project:
development:
  solr:
    solr_home: solr
    hostname: localhost
    port: 8982
    path: /solr/development
This config does not work because:
+ I'm using Docker, so the hostname should be the name of the service configured in docker-compose.yml. In my case, that is solr.
+ My Solr is not running on port 8982. Check which port your Solr is actually listening on (8983, the Solr default, in my case) and make sure the same port is used everywhere.
Here is the final sunspot.yml that works for me:
development:
  solr:
    solr_home: solr
    hostname: solr
    port: 8983
    path: /solr/development
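To double-check which endpoint a config like this points at, here is a minimal sketch in plain Ruby. The helper name solr_base_url is hypothetical (it is not part of sunspot), and the YAML is inlined only so the snippet runs on its own; it also assumes plain http, which is what my setup uses.

```ruby
require "yaml"
require "uri"

# Same settings as the working sunspot.yml above, inlined so the
# sketch is self-contained (solr_home is omitted, it does not affect the URL).
SUNSPOT_YML = <<~YAML
  development:
    solr:
      hostname: solr
      port: 8983
      path: /solr/development
YAML

# Hypothetical helper: builds the base URL implied by the settings
# of one environment. Assumes plain http.
def solr_base_url(config, env)
  solr = config.fetch(env).fetch("solr")
  URI::HTTP.build(
    host: solr.fetch("hostname"),
    port: solr.fetch("port"),
    path: solr.fetch("path")
  ).to_s
end

config = YAML.safe_load(SUNSPOT_YML)
puts solr_base_url(config, "development")
```

Requesting the printed URL from inside the Rails container is a quick way to confirm that the hostname and port actually resolve to your Solr service.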
2. Illegal character ... in the Solr server log
This is caused by the protocol sunspot uses when sending requests to Solr. When you see the exception, pay attention to the scheme of the URI: is it http or https? In my case, the illegal character... error appeared because sunspot was trying to connect over https, while my Solr container only serves plain http.
To fix the issue, make sure you do not specify scheme: https in your sunspot.yml. Also check whether an ENV variable such as SOLR_URL or WEBSOLR_URL is set with a different scheme.
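The ENV check can be sketched in a few lines of Ruby. The helper names https_url? and https_solr_vars are hypothetical, and the list of variable names is only the two mentioned above; your project may read others.

```ruby
require "uri"

# Hypothetical helper: true when the value parses as a URL with the https scheme.
def https_url?(value)
  URI.parse(value.to_s).scheme == "https"
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: returns the names of the listed variables
# whose value is an https URL. Works with ENV or a plain Hash.
def https_solr_vars(env, names = %w[SOLR_URL WEBSOLR_URL])
  names.select { |name| https_url?(env[name]) }
end

offenders = https_solr_vars(ENV)
if offenders.empty?
  puts "No https Solr URL found in ENV"
else
  puts "https is configured in: #{offenders.join(', ')}"
end
```

Run it inside the same container as the Rails app, since that is the environment sunspot will actually see.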
3. Undefined field "type" (or class_name)
This happens because the default schema created by Solr does not define the fields sunspot expects. In my case, I took the Solr 5 schema shipped with sunspot from https://github.com/sunspot/sunspot/blob/99f2a0d0945e4e6e2f81352c2af0effa6b71d121/sunspot_solr/solr/solr/configsets/sunspot/conf/schema.xml and pasted its content into the managed-schema file inside my core. If your core is named development, the file is at <your_solr_data_dir>/development/conf/managed-schema
After replacing the file content, restart your Solr container and it should work.
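To confirm that the replaced file declares the fields from the error message, a rough check like the one below may help. It is only a sketch: the helper missing_fields is hypothetical, it scans the text with a regular expression instead of parsing the XML (so a commented-out field would still count as declared), and the schema path is passed as an argument because it depends on your setup.

```ruby
# Hypothetical helper: returns the required field names that the schema
# text does not declare in a <field ... name="..."> element.
def missing_fields(schema_xml, required = %w[type class_name])
  declared = schema_xml.scan(/<field\s[^>]*?\bname="([^"]+)"/).flatten
  required - declared
end

# Usage: ruby check_schema.rb <your_solr_data_dir>/development/conf/managed-schema
path = ARGV[0]
if path && File.exist?(path)
  missing = missing_fields(File.read(path))
  if missing.empty?
    puts "All required fields are declared"
  else
    puts "Missing fields: #{missing.join(', ')}"
  end
end
```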
Tips
You might want to index a single model from the rails console to test. Find a model that is configured to be searchable on Solr and run:
Model.index
For example, I have a model called Store, so I can simply run:
Store.index
I often find it easier to debug this way than by running the rake task for reindexing.
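If the exception scrolls by too quickly in the console, a small wrapper can capture it. This is a minimal sketch: try_index is a hypothetical helper, and it only assumes that the object you pass responds to index, as searchable models do.

```ruby
# Hypothetical helper: calls index on the given model and reports
# the exception class and message instead of letting it propagate.
def try_index(model)
  model.index
  { ok: true, error: nil }
rescue StandardError => e
  { ok: false, error: "#{e.class}: #{e.message}" }
end
```

In the rails console you would paste the method and then call try_index(Store). The error string usually says whether the problem is the connection, the scheme, or the schema.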