Welcome to 16892 Developer Community-Open, Learning,Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Problem

I'm building a postgres database for a few hundred thousand products. I will set-up an index (Solr or maybe ElasticSearch) to improve query times for complex search queries.

The point now is how to let the index synchronized with the database?

In the past I had a kind of application that polled the database periodically to check for updates that should be done, but I would have an outdated index state time (from the database update to the index update pull).

I would prefer a solution in which the database would notify my application (java application) that something has been changed within the database, and at that point the application will decide if the index needs to be updated or not. To be more accurate, I would build a kind of producer and consumer structure in wish the replica will receive notifications from postgres that something changed, if this is pertinent to the data indexed it is stored in a stack of updates-to-do. The consumer would consume this stack and build the documents to be stored into the index.

Possible Solutions

One solution would be to write a kind of replica end-point in which the application would behave as a postgres instance that is being used to replicate the data from the original database. Do someone have some experience with this approach?

Which other solution do I have for this problem?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
1.7k views
Welcome To Ask or Share your Answers For Others

1 Answer

Which other solution do I have for this problem?

Use LISTEN and NOTIFY to tell your app that things have changed.

You can send the NOTIFY from a trigger that also records changes in a queue table.

You'll need a PgJDBC connection that has sent a LISTEN for the event(s) you're using. It must poll the database by sending periodic empty queries ("") if you're using SSL; if you are not using SSL this can be avoided by use of the async notification checks. You'll need to unwrap the Connection object from your connection pool to be able to cast the underlying connection to a PgConnection to use listen/notify with. See related answer

The producer/consumer bit will be harder. To have multiple crash-safe concurrent consumers in PostgreSQL you need to use advisory locking with pg_try_advisory_lock(...). If you don't need concurrent consumers then it's easy, you just SELECT ... LIMIT 1 FOR UPDATE a row at a time.

Hopefully 9.4 will include an easier method of skipping locked rows with FOR UPDATE, as there's work in development for it.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to 16892 Developer Community-Open, Learning and Share
...