Migrating a Rails app from MySQL with Spatial Extensions to PostgreSQL with PostGIS
There’s much documentation about converting a Rails app from MySQL to PostgreSQL (if you haven’t written any SQL in your code, you can often just use Taps), but very little about converting a Rails app (or any app, actually) from using MySQL Spatial Extensions to using PostGIS.
In my case, I’m working on a Rails 2.3.14 app using the spatial_adapter, geokit, geokit-rails and georuby gems. If you’re using Rails 3 with rgeo and activerecord-postgis-adapter, these tips will probably help you, too.
Code changes
If you wrote any SQL in your code (or if you’re not sure), you may have to update it to work with PostgreSQL. Code snippets to search for:
- count_by_sql, find_by_sql, delete_all, destroy_all, update_all, exists?
:select,:from,:joins,:conditions,:order,:groupand:havingoptions using MySQL functions or syntax- has_many_and_belongs_to and has_many associations with
:finder_sql,:counter_sql,:delete_sqlor:insert_sqloptions - connection, which can be called from
ActiveRecord::Baseor any model, and any of its methods like execute, select_rows and others
Read things to know when moving from MySQL to PostgreSQL to learn that, for example:
- PostgreSQL uses single quotes to quote values, never double quotes.
- PostgreSQL use double quotes, not backticks, to quote system identifiers.
- PostgreSQL is case-sensitive for string comparisons.
- All non-aggregate SELECT columns must appear in the GROUP BY clause.
- An INNER JOIN without an ON clause must become a CROSS JOIN.
- Column aliases in SELECT clauses require the AS keyword.
Using Jinja2 in Django with Coffin: The Easy Way
There is a lot of old, incorrect and just plain bad documentation on how to use the Jinja2 templating language in Django. Here’s how to do it The Easy Way.
Install Jinja2, Pygments and Coffin with
easy_install Jinja2 Pygments Coffinor by editing your Pip requirements.txt file.In
views.py, change:from django.shortcuts import render_to_responseto:
from coffin.shortcuts import render_to_responseUpdate your templates to use Jinja2 syntax. Change
block.supertosuper().If you use
{% compress %}tags, add:JINJA2_EXTENSIONS = [ 'compressor.contrib.jinja2ext.CompressorExtension', ]to your
settings.pyas described here.
Fix Facebook Like button for IE6 and IE7
You may have noticed in your server logs or analytics a significant number of requests to pages on your site with the query string ?fb_xd_fragment=. This behaviour is a symptom of the Facebook Like button working improperly in IE6 and IE7. In order to correct this bug, you must set a channelUrl for your Facebook Like button, following Facebook’s documentation on FB.init.
window.fbAsyncInit = function() {
FB.init({
appId : 'YOUR_APP_ID',
status : true,
cookie : true,
oauth : true,
xfbml : true,
channelUrl : 'YOUR_CHANNEL_URL'
});
};
(function(d, s, id) {
var js, fjs = d.getElementsByTagName(s)[0];
if (d.getElementById(id)) {return;}
js = d.createElement(s); js.id = id;
js.src = "//connect.facebook.net/en_US/all.js#appId=YOUR_APP_ID&xfbml=1";
fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));
A few notes on the above: As documented by Facebook, “the function assigned to window.fbAsyncInit is run as soon as the Facebook SDK is loaded.” Importantly, YOUR_CHANNEL_URL should be a URL that returns a single line of HTML:
<script src="http://connect.facebook.net/en_US/all.js"></script>
Google News Search parameters (The Missing Manual)
There is a severe lack of documentation of Google News Search’s query parameters. Here is the most comprehensive list I am aware of. A number of these overlap with Google Search’s query parameters. I exclude any Google Search parameters that don’t make sense in the context of Google News Search. These parameters are primarily of interest to anyone querying Google News for articles using its RSS and Atom output formats.
Search
The following query parameters control Google’s interpretation of keywords. Although as_epq may be attractive for some use cases, it breaks sorting by date (scoring=n) when output is “rss” or “atom”, so stick with q for consistent sorting. Update: It seems the as_* parameters don’t work when output is “rss” or “atom”.
q=TERMSretrieve results with all of the terms. Alias:as_qas_epq=TERMSretrieve results with the exact phraseas_oq=TERMSretrieve results with at least one of the wordsas_eq=TERMSretrieve results without the termsas_occt=SECTIONretrieve results where my terms occur…anyanywhere in the articletitlein the headline of the article (same as using “allintitle:” inq)bodyin the body of the article (same as using “allintext:” inq)urlin the URL of the article (same as using “allinurl:” inq)
Filter
The following query parameters filter results by Google News edition, topic, location, date, news source or author.
ned=EDITIONlimits results to a specific edition. Possible values: editionstopic=TOPIClimits results to a specific topic. Possible values: topicsgeo=LOCATIONlimits results to a specific locationdetect_metro_areadetermines location based on IP- a city, state, country, or US zip code
as_drrb=qretrieves articles added by Google News…as_qdr=aanytimeas_qdr=hlast houras_qdr=dlast dayas_qdr=wpast weekas_qdr=mpast monthas_qdr=ypast year
as_drrb=bretrieves articles added by Google News between…as_minm=NUMminimum month. Possible values: [1, 12]as_mind=NUMminimum day. Possible values: [1, 31]as_maxm=NUMmaximum month. Possible values: [1, 12]as_maxd=NUMmaximum day. Possible values: [1, 31]
as_nsrc=SOURCElimits results to a specific news source (same as using “source:” inq)as_nloc=LOCATIONlimits results to news sources from a specific location (same as using “location:” inq)as_authorlimits results to a specific author (same as using “author:” inq)
The as_ddrb family of parameters is occasionally set using the tbs parameter when using the web interface. You do not need to learn the tbs syntax. Note that as_nloc and geo are not synonymous.
Boost
I’m not confident that these parameters do anything when output is “rss” or “atom”, but here they are for completeness.
gl=COUNTRYboosts search results from a specific country of origin. Possible values: country codesgll=LATITUDE,LONGITUDEboosts search results near that point. Latitude and longitude must be integer microdegrees. In otherwords, multiply each number by a million and round to the nearest integer.gr=REGIONboosts search results from a specific region. Possible values: province codesgm=METROboosts search results from a specific metropolitan area. Possible values: metro codesgpc=ZIPCODEboosts search results from a specific zip code.glmust be “us”.gcs=CITYboosts search results from a specific city. Possible values: city names in the United States and worldwide
Order
scoring=ORDERsorts search results. Default: “r”. Alias: “as_scoring”- “r” by relevance
- “n” by date (newest first)
- “d” by date (newest first) with duplicates
- “o” by date (oldest first)
Paginate
num=NUMretrieves NUM results. Default: 10. Possible values: ifqpresent [1,100], otherwise [1, 30]start=OFFSETretrieves results starting from OFFSET. NUM plus OFFSET must be less than 1000, otherwise you will get zero results. Ignored ifoutputis “rss” or “atom”. Default: 0. Requiresq.
Output
output=FORMATsets the output formatrssretrieves RSS feedatomretrieves Atom feed
hl=LANGUAGEsets host language. Default: “us”. Possible values: languageshdlOnly=1displays headlines onlyqsid=IDused in combination withcf=q. Update: This feature has been removed.
In older versions of Google News, it was possible to change the output of the web interface using a cf parameters. It no longer seems to work, but it is here for completeness. Note that when output is set to “rss” or “atom”, this parameter is in fact ignored.
cf=CODEallretrieve any contentqretrieve only quotes. Requiresqsid. Update: This feature has been removed.iretrieve only imagesbretrieve only blogs
Google tracking
Google uses parameters to track how users are using the web interface, which may include aq, authuser, btnmeta_news_search, edchanged, client, rls, oi, oq, resnum, sa, source, sourceid, swrnum, tab. You don’t need to worry about these, unless you want to bias Google’s internal statistics on user behavior.
Google Search
If using the web interface, performing a keyword search from Google News will redirect you to a Google Search page. The tbm=nws query parameter informs Google to display news results only.
Undocumented
I haven’t yet figured out what these do, but their impact seems minimal. If you have a clue, please mention it in the comments!
pzis usually set to 1. Default: 1. Possible values: [0, 1]ictPossible values: “ln”, “itn0”, “tnv0”csid
Deprecated
The following parameters work only on the deprecated Google News Search API.
v=1.0sets the API version. Possible values: “1.0”rsz=SIZEsets results sizesmallretrieves four resultslargeretrieves eight results
useripsets user’s IP as an abuse counter-mesurecallbackruns JavaScript callbackcontextsets callback contextkeysets API key
Sources
Google Refine fingerprint clustering algorithm in Ruby
If you’ve used Google Refine, you know how useful its clustering algorithms are for finding and merging alternative representations of the same thing, e.g. “Gödel, Escher, Bach”, “Godel, Escher, Bach” (accents), “Gödel Escher Bach” (punctuation), “gödel, escher, bach” (case).
In my open data and data journalism projects, I perform similar data consolidation and reconciliation steps to avoid unwanted duplicates. Following Google Refine’s algorithm description, I wrote this snippet that adds a #fingerprint method to Ruby’s String class.
Remove all accents and diacritics from string in Ruby
As a follow-up to this post, if you want to remove all accents and diacritics from a text string in Ruby, while preserving case, use this snippet:
Reserved methods in Rails
I occasionally waste many hours debugging code that fails because I accidentally redefine a method inherited from a Rails class. For future reference:
ActionMailer
How to cache an internationalized site with Rack::Cache in Rails 3
I have bilingual and international websites that serve different translated content depending on the current locale. The current locale may be set according to the domain, subdomain, path, query string, or a cookie. If it is set according to a cookie, read on! If it is set using anything in the URL, Rails 3 (using Rack::Cache internally) will cache pages correctly for you.
Now, in my cookie-based site, I want to implement caching. However, if I naively set a Cache-Control header on a page, I’ll end up caching a single language’s content, and serving that language’s content to all visitors, independent of their current locale! I can’t have French visitors reading English just because an English visitor was the first to visit the page.
Rails 3 uses the Rack::Cache middleware internally for caching. Rack::Cache generates a “cache key” based on the URL used to access a page. You can read the code to generate the cache key. It then looks up that key in its cache. If it exists, it returns the cached content. If not, Rails serves the request, and Rack::Cache caches the response under that key for next time.
We can solve the above caching problem by setting a custom cache key for Rack::Cache that includes the current locale, so that we cache one version of the page for each language. The code to generate the default cache key is:
Rack::Cache::Key.new(request).generate
If you follow all documentation on this subject, you’ll probably write something like this in production.rb to set a custom cache key:
config.middleware.use Rack::Cache,
:metastore => 'file:/var/cache/rack/meta',
:entitystore => 'file:/var/cache/rack/body',
:cache_key => Proc.new { |request|
[I18n.locale, ':', Rack::Cache::Key.new(request).generate].join
}
This won’t work! If you read the code for ActionDispatch’s Railtie, you’ll see the line:
config.action_dispatch.rack_cache = {
:metastore => "rails:/",
:entitystore => "rails:/",
:verbose => true,
}
The solution is then to set the custom cache key in production.rb like this:
config.action_dispatch.rack_cache[:cache_key] = Proc.new { |request|
[I18n.locale, ':', Rack::Cache::Key.new(request).generate].join
}
Tada!
How to safely create friendly From addresses including name and email
You have a contact form that collects a person’s name and email address. When sending their email, you want to get these two pieces of information from looking like this:
Name: John Doe Email: john@example.com
to this:
From: John Doe <john@example.com>
or:
Reply-To: John Doe <john@example.com>
If you simple concatenate strings, then you will either send a malformed email or introduce a security risk. Here’s a safe, proper way of doing it using the Mail gem: