diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..fa2a6ea
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,19 @@
+# CHANGELOG
+
+**0.4.0**
+
+* Features
+  * Added support for RMLS (Portland) MLS
+* Bug fixes
+  * Don't assume there's a response in errors from axios
+* General improvements
+  * Downloads are generally more resilient, retrying 3 times, whether the response is non-200 (as in, not successful), or if the response is invalid JSON
+  * Move more responsibilities into the platform adapters so that functionality can be different for each platform
+  * The directory `config/user` is ignored by git, so you can put any files in there that you want, such as JavaScript files that you `require()` from your `config/config.js`.
+* User config API
+  * Minor version bump from 0.3.0 to 0.4.0
+  * For the MySQL destination, the `transform()` function now has a `cache` parameter, just like the Solr destination. This allows you to more efficiently do lookups, etc, because you won't lose those lookups between calls to `transform`. For more, see where `transform()` is mentioned in [config.example.js](config/config.example.js).
+
+**OLDER**
+
+For older releases, the authoritative source is the source code.
diff --git a/README.md b/README.md
index ee43f7f..85be838 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 ![OpenReSync logo](https://user-images.githubusercontent.com/366538/144324447-cfa1c275-9bad-47d7-aff5-57a3af297f48.png)
 
-Open Real Estate Sync (openresync) is a node application that syncs (replicates) MLS data from one or more sources via the RESO Web API (such as Trestle, MLS Grid, or Bridge Interactive) to one or more destinations (MySQL and Solr are supported so far, and it's easy enough to add more), and allows you to see the sync status via a local website.
+Open Real Estate Sync (OpenRESync) is a node application that syncs (aka replicates) MLS data from one or more sources via the RESO Web API (such as Trestle, MLS Grid, RMLS, or Bridge Interactive) to one or more destinations (MySQL and Solr are supported so far, and it's easy enough to add more), and allows you to see the sync status via a local admin website.
 
 It is meant to be used by developers.
 
@@ -16,6 +16,14 @@ It does the syncing for you, and helps you answer these questions:
 
 Want to convince your boss to use this tool, so that you can actually spend your time working on your team's product instead of syncing MLS data? Send them to https://openresync.com.
 
+## Chat
+
+There is a discord server available. Anyone can ask questions, and developers can get technical support.
+
+Here's an invite: [https://discord.gg/EgbgSKRB93](https://discord.gg/EgbgSKRB93)
+
+It's a temporary invite (to prevent bots and abuse), which means that you will be kicked when you disconnect, but while you're in there, just request a permanent invite.
+
 ## Project status
 
 This project is in alpha status. It is meant for those who could benefit from what it does enough to offset the downside of likely shortcomings.
@@ -97,7 +105,7 @@ Then visit the website at http://localhost:3461
 **Here's how to run it in production:**
 
 ```
-$ NODE_ENV=production TZ=UTC node server/index.js
+$ TZ=UTC node server/index.js
 ```
 
 Then visit the website at http://localhost:4000
@@ -119,12 +127,6 @@ See the heavily commented `config/config.example.js`. Copy it to `config/config.
 
 There is an internal configuration file that you should be aware of, which is described in the "How does it work?" section.
 
-### .env
-
-It's recommended to put secrets in a `.env` file. These will be read automatically using the `dotenv` library and available for your config file in `process.env` values.
-
-There's no `example.env` type of file because there are no standard fields you should configure.
For example, in a project that uses the Austin Board of Realtors sample dataset, you might use environment variables like ABOR_CLIENT_ID and ABOR_CLIENT_SECRET to store your Oauth credentials, and then you could reference them with e.g. `process.env.ABOR_CLIENT_ID` in your `config/config.js` file. There's no particular recommendation other than you keep your secrets out of the git repository.
-
 ## How does it work?
 
 You should know these basics so you can debug problems.
@@ -163,7 +165,7 @@ For each of the processes, the state is recorded in the internal config file (di
 
 #### Sync
 
-The sync process adds or updates records (no deletions). To know what to download from the MLS, the destination (or first destination, if you have multiple) is queried to get the most recent timestamp. Then, the MLS is queried for records with a newer timestamp. If there is an error when downloading the files, then when the sync process is next run, instead of querying the destination for the newest record, it will look newest record in the newest file.
+The sync process adds or updates records (no deletions). To know what to download from the MLS, the destination (or first destination, if you have multiple) is queried to get the most recent timestamp. Then, the MLS is queried for records with a newer timestamp. If there is an error when downloading the files, then when the sync process is next run, instead of querying the destination for the newest record, it will look at the newest record in the newest file.
 
 Note that the first sync might take hours depending on the platform and number of records in the MLS and if you filter any out. However, subsequent syncs generally run quite quickly so it's reasonable to run it, say, every 15 minutes.
 
@@ -242,7 +244,7 @@ Another advantage of syncing the data and creating your own API is you basically
 
 ## Known limitations
 
-1. One of the main value propositions of this application is to make it robust in error handling.
It is desired that the application not crash and wisely show error situations to the user. However, this has not been tested thoroughly. Some errors might be swallowed altogether. Some errors are quite verbose and we don't shorten these yet. It would definitely be great to catch 502 and 504 errors from the platforms and retry downloads, but this is not done yet.
+1. One of the main value propositions of this application is to make it robust in error handling. It is desired that the application not crash and wisely show error situations to the user. However, this has not been tested thoroughly. Some errors might be swallowed altogether. Some errors are quite verbose and we don't shorten these yet.
 1. The Solr data adapter does not yet sync/manage the schema for you (though the MySQL data adapter does).
 
 ## Roadmap
diff --git a/config/config.example.js b/config/config.example.js
index b91a792..c28b77a 100644
--- a/config/config.example.js
+++ b/config/config.example.js
@@ -228,9 +228,13 @@ const bridgeExample = {
  // makeFieldName option. But this function would allow you to modify the data in any way, for example change
  // keys, values, add key/value pairs, remove some, etc. It takes the MLS resource name, the record as
  // downloaded, the metadata, and, if the record is from an $expand'ed resource, the (potentially transformed)
- // parent object, and should return an object. Do not mutate the record passed in. Note: For the primary key's
- // value, you may return null if your table's primary key is auto-incremented, which is the default.
- // transform: (mlsResourceName, record, metadata, transformedParentRecord) => {
+ // parent object, and finally, a cache object, and should return an object. Do not mutate the record passed in.
+ // Note: For the primary key's value, you may return null if your table's primary key is auto-incremented, which
+ // is the default.
+ // The "cache" is an (initially empty) object that we pass each time to the transform function, on which you may
+ // put any data you wish. This allows the transform function to e.g. do lookup work when it chooses; it could do
+ // it all on the first pass and not again, or it could potentially do it only on-demand somehow.
+ // transform: (mlsResourceName, record, metadata, transformedParentRecord, cache) => {
  // // Return an object. Do not mutate record. As in, make a copy, modify and return that.
  // }
  },
@@ -282,10 +286,10 @@ const bridgeExample = {
  // difference between what you want inserted/updated is the field names, then you should use the
  // makeFieldName option. But this function would allow you to modify the data in any way, for example change
  // keys, values, add key/value pairs, remove some, etc. It takes the MLS resource name, the record as
- // downloaded, and the metadata object, a cache object (described below), and should return an object.
- // The "cache" is an (originally an empty) object that we pass each time to the transform function. This
- // allows the transform function to do lookup work when it chooses, e.g. it could do it all on the first pass
- // and not again, or it could potentially do it only on-demand somehow.
+ // downloaded, the metadata object, and a cache object, and should return an object.
+ // The "cache" is an (initially empty) object that we pass each time to the transform function, on which you may
+ // put any data you wish. This allows the transform function to e.g. do lookup work when it chooses; it could do
+ // it all on the first pass and not again, or it could potentially do it only on-demand somehow.
  // TODO: The transformedParentRecord parameter doesn't exist, as it does in the MySQL data adapter, because I
  // have not personally needed it. Its main motivation is for relational databases and might not be needed in Solr,
  // which supports nested documents, although it wouldn't hurt to have it.
It could be added by request.
diff --git a/lib/sync/dataAdapters/mysql/mysql.js b/lib/sync/dataAdapters/mysql/mysql.js
index 6cded8d..506d9f7 100644
--- a/lib/sync/dataAdapters/mysql/mysql.js
+++ b/lib/sync/dataAdapters/mysql/mysql.js
@@ -265,7 +265,7 @@ module.exports = ({ destinationConfig }) => {
  platformAdapter,
  makeFieldName,
  )
- // The "cache" is an (originally an empty) object that we pass each time to the transform function. This allows the
+ // The "cache" is an (initially empty) object that we pass each time to the transform function. This allows the
  // transform function to e.g. do lookup work when it chooses, storing it on the cache object. For example, it could
  // do it all on the first pass and not again, or it could potentially do it only on-demand somehow. But we don't
  // have to force it to do it at any particular time.
diff --git a/lib/sync/dataAdapters/solr/solr.js b/lib/sync/dataAdapters/solr/solr.js
index 3ff3e1d..a07341a 100644
--- a/lib/sync/dataAdapters/solr/solr.js
+++ b/lib/sync/dataAdapters/solr/solr.js
@@ -91,7 +91,7 @@ module.exports = ({ destinationConfig }) => {
  platformAdapter,
  makeFieldName,
  )
- // The "cache" is an (originally an empty) object that we pass each time to the transform function. This allows the
+ // The "cache" is an (initially empty) object that we pass each time to the transform function. This allows the
  // transform function to e.g. do lookup work when it chooses, storing it on the cache object. For example, it could
  // do it all on the first pass and not again, or it could potentially do it only on-demand somehow. But we don't
  // have to force it to do it at any particular time.
diff --git a/package.json b/package.json
index b73a5ae..5d63916 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "openresync",
- "version": "0.2.0",
+ "version": "0.4.0",
  "private": true,
  "scripts": {
  "dev": "TZ=UTC nodemon --watch server/ --watch lib/ --watch tests/qa/ --watch tests/fixtures/",