Hugo Search With Algolia - Part One

January 31, 2022 · Mike

Searching on a static site that only serves basic HTML+JS is tricky. The classic model of a backend database and app server that you might have used in the past isn’t there any more, so there is no database to query. So how does search work on a static site? One option is to use the APIs and services from Algolia.

Algolia provides the smarts for the search once you’ve provided the indexed content. Then your static site sends an API request with a search term and they send back the results from what they already know. It’s like searching Google; your browser doesn’t query every site on the internet, it uses the Google search service and returns what they have collected in the past.

Now you don’t have to use Algolia or similar remote services. When you run hugo to build your site, your config can create a local index file that your theme can use. You do miss out on some useful information with a local search index. Items such as search terms, misses and usage metrics aren’t captured like they can be with a remote service. But a local index is good for small sites or for privacy concerns (which might also need a site to have more complex features like logins anyway).

In Part One of this short series, we’ll create the index file for the content and submit it to Algolia.

Index Creation

You need an index file to be generated with the content you want to be searchable. Content titles, summaries, authors, and the body text are all things containing info that people would search for. So we need to create a JSON file that includes all that data.

In your config.toml file, add an output format that is generated when you run the build process. Add the following, or modify your existing settings to include it.

[outputs]
home = ["HTML", "RSS", "JSON", "Algolia"]

[outputFormats.Algolia]
baseName = "algolia"
isPlainText = true
mediaType = "application/json"
notAlternative = true

[params.algolia]
vars = ["title", "summary", "date", "publishdate", "expirydate", "permalink", "image"]
params = ["tags"]

The new Algolia item in the outputs list links to the matching outputFormats.Algolia block, and the params.algolia block defines which attributes of your content end up in the index. The type of content here is blog posts.

It’s important to note that those vars define what goes into the index, and therefore what is submitted to Algolia and used for searches. With this config the body text (content) is not in the list, so don’t expect a search term like “frogs” to match if it doesn’t appear in the generated public/algolia.json index file.

You do need more than those couple of config blocks to make an index. Create a new file called layouts/_default/list.algolia.json and we’ll fill that with the details on exactly what attributes to capture from what content source in your static site.

{{/* Generates a valid Algolia search index */}}
{{- $hits := slice -}}
{{- $section := $.Site.GetPage "section" .Section }}
{{- $validVars := $.Param "algolia.vars" | default slice -}}
{{- $validParams := $.Param "algolia.params" | default slice -}}
{{/* Include type of content? Change 'blog' based on your content. */}}
{{- range $i, $hit := where (where .Site.Pages "Type" "in" (slice "blog")) "IsPage" true -}}
  {{- $dot := . -}}
  {{- if or (and ($hit.IsDescendant $section) (and (not $hit.Draft) (not $hit.Params.private))) $section.IsHome -}}
    {{/* We need objectID as something unique for Algolia */}}
    {{- .Scratch.SetInMap $hit.File.Path "objectID" $hit.File.UniqueID -}}
    {{/* Keep the page attributes you need in an iterable object */}}
    {{- .Scratch.SetInMap "temp" "content" $hit.Plain -}}
    {{- .Scratch.SetInMap "temp" "date" $hit.Date.UTC.Unix -}}
    {{- .Scratch.SetInMap "temp" "description" $hit.Description -}}
    {{- .Scratch.SetInMap "temp" "image" $hit.Params.Image -}}
    {{- .Scratch.SetInMap "temp" "dir" $hit.File.Dir -}}
    {{- .Scratch.SetInMap "temp" "path" $hit.File.Path -}}
    {{- .Scratch.SetInMap "temp" "expirydate" $hit.ExpiryDate.UTC.Unix -}}
    {{- .Scratch.SetInMap "temp" "fuzzywordcount" $hit.FuzzyWordCount -}}
    {{- .Scratch.SetInMap "temp" "keywords" $hit.Keywords -}}
    {{- .Scratch.SetInMap "temp" "kind" $hit.Kind -}}
    {{- .Scratch.SetInMap "temp" "lang" $hit.Lang -}}
    {{- .Scratch.SetInMap "temp" "lastmod" $hit.Lastmod.UTC.Unix -}}
    {{- .Scratch.SetInMap "temp" "permalink" $hit.Permalink -}}
    {{- .Scratch.SetInMap "temp" "publishdate" $hit.PublishDate -}}
    {{- .Scratch.SetInMap "temp" "readingtime" $hit.ReadingTime -}}
    {{- .Scratch.SetInMap "temp" "relpermalink" $hit.RelPermalink -}}
    {{- .Scratch.SetInMap "temp" "summary" $hit.Summary -}}
    {{- .Scratch.SetInMap "temp" "title" $hit.Title -}}
    {{- .Scratch.SetInMap "temp" "type" $hit.Type -}}
    {{- .Scratch.SetInMap "temp" "url" $hit.Permalink -}}
    {{- .Scratch.SetInMap "temp" "weight" $hit.Weight -}}
    {{- .Scratch.SetInMap "temp" "wordcount" $hit.WordCount -}}
    {{- .Scratch.SetInMap "temp" "section" $hit.Section -}}
    {{/* Include valid page vars */}}
    {{- range $key, $param := (.Scratch.Get "temp") -}}
      {{- if in $validVars $key -}}
        {{- $dot.Scratch.SetInMap $hit.File.Path $key $param -}}
      {{- end -}}
    {{- end -}}
    {{/* Include valid page params */}}
    {{- range $key, $param := $hit.Params -}}
      {{- if in $validParams $key -}}
        {{- $dot.Scratch.SetInMap $hit.File.Path $key $param -}}
      {{- end -}}
    {{- end -}}
    {{- $.Scratch.SetInMap "hits" $hit.File.Path (.Scratch.Get $hit.File.Path) -}}
  {{- end -}}
{{- end -}}
{{- jsonify ($.Scratch.GetSortedMapValues "hits") -}}

This template walks through the pages of type blog (content/blog here) and builds up a map of attributes for each one. It then uses the vars and params lists you defined in config.toml to filter those attributes before outputting the result as JSON in public/algolia.json. You can change the attributes used, run a build and then open the resulting index file in your public folder to see what’s changed.
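The filtering step can be easier to follow outside of template syntax. The sketch below mirrors it in plain JavaScript. It only illustrates the logic and is not part of Hugo or Algolia; the function name buildRecord and the sample page object are made up for the example.

```javascript
// Illustrative only: mirrors the whitelist logic of list.algolia.json.
// buildRecord and samplePage are hypothetical, not a Hugo or Algolia API.
function buildRecord(page, validVars, validParams) {
  // Every record needs a unique objectID for Algolia.
  const record = { objectID: page.uniqueID };

  // Copy only the built-in page attributes listed in [params.algolia] vars.
  for (const [key, value] of Object.entries(page.vars)) {
    if (validVars.includes(key)) record[key] = value;
  }

  // Copy only the front matter params listed in [params.algolia] params.
  for (const [key, value] of Object.entries(page.params)) {
    if (validParams.includes(key)) record[key] = value;
  }
  return record;
}

// Made-up page data for the example.
const samplePage = {
  uniqueID: "abc123",
  vars: { title: "Frogs", summary: "A post about frogs", content: "Full body text", wordcount: 3 },
  params: { tags: ["amphibians"], private: false },
};

const record = buildRecord(samplePage, ["title", "summary"], ["tags"]);
console.log(JSON.stringify(record, null, 2));
```

In this example content and wordcount are dropped because they are not in the vars list, and private is dropped because it is not in the params list. Only objectID, title, summary and tags make it into the record.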

Play around until you get the sort of search content you’re expecting. If you add an entry to vars in your config.toml, make sure the same attribute is set in the list.algolia.json template too; entries in params are read directly from each page’s front matter. Any errors in the console should point you in the right direction.
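While you experiment, a quick sanity check of the generated file can save a failed upload later. The following is a minimal sketch, assuming the index is a JSON array of records; checkIndex is a hypothetical helper and the sample records are invented for the example.

```javascript
// Illustrative check for the generated index; checkIndex is a hypothetical helper.
function checkIndex(records) {
  if (!Array.isArray(records)) return ["index is not a JSON array"];

  const problems = [];
  const seen = new Set();
  records.forEach((rec, i) => {
    if (rec == null || rec.objectID === undefined || rec.objectID === "") {
      problems.push(`record ${i} has no objectID`);
    } else if (seen.has(rec.objectID)) {
      problems.push(`record ${i} repeats objectID ${rec.objectID}`);
    } else {
      seen.add(rec.objectID);
    }
  });
  return problems;
}

// In a real project you would load the file instead, for example:
//   const records = JSON.parse(require("fs").readFileSync("public/algolia.json", "utf8"));
const records = [
  { objectID: "a1", title: "First post" },
  { objectID: "a1", title: "Copy of first post" },
  { title: "No id" },
];
console.log(checkIndex(records));
```

An empty result means every record has its own objectID. Anything else lists the records to look at in the template.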

Updating Algolia

First off, you need an account from Algolia of course. They offer free accounts (no credit card needed) for up to 10,000 items in your index and 10,000 search queries per month. For many sites this is probably enough to at least get started. Above those numbers you’re charged US$1.00 for blocks of additional 1,000 units.

As part of the on-boarding workflow for your new account, you’ll be asked to create an application and an associated index name. You can use whatever makes sense for you in these names, but the index name will be used in your front end code later.

Once you have your application and index created in Algolia, you need to add information to your index. If you’ve completed the Hugo config above and have a suitable JSON file in your public folder, you should be good to go.

To add your index file, open up the correct application and index config in Algolia and use the Add Records option as below to upload your generated public/algolia.json index file.

Add records to your first index by uploading the index file

Your first index has now been loaded and you can use the Search field in the Browse tab on that same Algolia page to practice some queries on your data. This simulates the results and the content that will be returned by an API call from your Hugo site.

If your indexed data isn’t quite doing what you want, you can use the Manage Index tab to clear the index (not delete it) and upload a new index file with your content. Just keep refining things until the right data and attributes are there and work as you expect.

Updating Algolia with Hugo Build

You don’t want to upload a new index file by hand each time you create or modify content. With an additional NPM package called atomic-algolia, your build process can use the Algolia API to update only the records that have changed. This automates the update and keeps down the number of operations, which could otherwise push your usage up and perhaps out of the free tier.
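To make the idea of a selective update concrete, here is a rough sketch of the kind of comparison involved. It is not atomic-algolia’s actual code: planSync and the sample records are hypothetical, and comparing records with JSON.stringify is a simplification that assumes both sides list their keys in the same order.

```javascript
// Conceptual sketch only, not atomic-algolia's implementation.
// planSync is a hypothetical name; records are matched by objectID.
function planSync(localRecords, remoteRecords) {
  const remoteById = new Map(remoteRecords.map((r) => [r.objectID, r]));
  const localIds = new Set(localRecords.map((r) => r.objectID));
  const plan = { add: [], update: [], remove: [], unchanged: [] };

  for (const rec of localRecords) {
    const remote = remoteById.get(rec.objectID);
    if (!remote) {
      plan.add.push(rec.objectID); // new locally, not yet in the remote index
    } else if (JSON.stringify(remote) !== JSON.stringify(rec)) {
      plan.update.push(rec.objectID); // exists remotely but the content differs
    } else {
      plan.unchanged.push(rec.objectID); // identical, nothing to send
    }
  }

  // Anything left in the remote index with no local counterpart gets removed.
  for (const r of remoteRecords) {
    if (!localIds.has(r.objectID)) plan.remove.push(r.objectID);
  }
  return plan;
}

// Made-up data for the example.
const localRecords = [
  { objectID: "a", title: "A" },
  { objectID: "b", title: "B edited" },
  { objectID: "d", title: "D" },
];
const remoteRecords = [
  { objectID: "a", title: "A" },
  { objectID: "b", title: "B" },
  { objectID: "c", title: "C" },
];
console.log(planSync(localRecords, remoteRecords));
```

Only the records in the add, update and remove lists need an API operation. The unchanged ones cost nothing, which is the saving compared with clearing the index and uploading the whole file again.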

Install atomic-algolia as per the NPM instructions from the link above. In your package.json file in the root of your project you need to add the following to the scripts section. This provides the npm script option to just do the index updates.

"scripts": {
    "algolia": "atomic-algolia"
}

So that the process knows about your Algolia information and the index to target, create a .env file in the root of your project and populate it as below. You can get your app and API key information from the Algolia web site by clicking on your profile in the top right and then selecting Settings and API Keys.

ALGOLIA_APP_ID=<YOUR_APP_ID>
ALGOLIA_ADMIN_KEY=<YOUR_ADMIN_KEY>
ALGOLIA_INDEX_NAME=<YOUR_INDEX_NAME>
ALGOLIA_INDEX_FILE=public/algolia.json
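A placeholder left unreplaced is an easy mistake to make here. The sketch below shows one way to catch it before running the update. missingSettings is a hypothetical helper, not something atomic-algolia provides; the variable names are the four from the .env file.

```javascript
// Hypothetical pre-flight check; the variable names are the four from the .env file.
const REQUIRED = ["ALGOLIA_APP_ID", "ALGOLIA_ADMIN_KEY", "ALGOLIA_INDEX_NAME", "ALGOLIA_INDEX_FILE"];

function missingSettings(env) {
  return REQUIRED.filter((name) => {
    const value = env[name];
    // Treat unset, empty, and unreplaced <PLACEHOLDER> values as missing.
    return !value || /^<.*>$/.test(value);
  });
}

// Example with one placeholder left in and two variables not set at all.
console.log(missingSettings({
  ALGOLIA_APP_ID: "<YOUR_APP_ID>",
  ALGOLIA_INDEX_FILE: "public/algolia.json",
}));
```

In a real script you could pass process.env once the variables have been loaded into the environment. Because the file holds your admin key, it is also worth keeping .env out of version control.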

Give things a test by running npm run algolia. The output should look like the below (if there were no changes). Note that drafts don’t get included. If there are errors in the output, check that your index file is being found and that objectID is defined for each record in your public/algolia.json output.

> blog@1.0.0 algolia /home/your_name/projects/blog
> atomic-algolia

[Algolia] Adding 0 hits to blog
[Algolia] Updating 0 hits to blog
[Algolia] Removing 0 hits from blog
[Algolia] 12 hits unchanged in blog
{}

Hooray, your index is now updated in an automated fashion and you can get back to creating your content.

Next Time

Right now you should have a working index file that you can search when logged into Algolia. In the next part of this short series we’ll add the front end HTML and JS to embed the search UI into your static site and make API calls to Algolia.
