fix: indexing for chinese #370

Merged
5 changes: 3 additions & 2 deletions README.md
@@ -19,12 +19,13 @@ And here are the current Ghost events / webhooks and their endpoints:
| Post unpublished | .../ghost/delete-record |
| Post deleted | .../ghost/delete-record |

Finally, here are the currently configured Algolia indices for each publication:
Finally, here are the currently configured Algolia indices for each publication, which are all production unless otherwise noted:

| Publication URL | CMS | Algolia index |
| --------------------------------------------- | -------- | ------------- |
| https://www.freecodecamp.org/news/ | Hashnode | news |
| https://chinese.freecodecamp.org/news/ | Ghost | news-zh |
| http://localhost:3030 | Ghost | news-es (dev) |
| https://www.freecodecamp.org/chinese/news/ | Ghost | news-zh |
| https://www.freecodecamp.org/espanol/news/ | Ghost | news-es |
| https://www.freecodecamp.org/italian/news/ | Ghost | news-it |
| https://www.freecodecamp.org/japanese/news/ | Ghost | news-ja |
52 changes: 33 additions & 19 deletions lib/utils/helpers.js
@@ -6,9 +6,14 @@ export const algoliaClient = algoliasearch(
process.env.ALGOLIA_ADMIN_KEY
);

export const searchIndexNameMap = {
'http://localhost:3030': 'news-es', // Dockerized Spanish Ghost instance for testing, which is mimicking the old English instance
'https://chinese.freecodecamp.org/news/': 'news-zh',
const searchIndexNameMap = {
// Dockerized Spanish Ghost instance for testing
'http://localhost:3030': 'news-es',
// The Chinese publication is at https://chinese.freecodecamp.org/news/,
// but this map and the getSearchIndexName function are used after formatting
// the post for Algolia, which updates necessary URLs for the search record
// to start with https://www.freecodecamp.org/chinese/news/
'https://www.freecodecamp.org/chinese/news/': 'news-zh',
'https://www.freecodecamp.org/espanol/news/': 'news-es',
'https://www.freecodecamp.org/italian/news/': 'news-it',
'https://www.freecodecamp.org/japanese/news/': 'news-ja',
@@ -96,6 +101,30 @@ export const formatHashnodePost = (post) => {
};
};

export const getBaseSiteURL = (url) => {
const URLObj = new URL(url);
const { host, pathname } = URLObj;
const pathParts = pathname.split('/').filter(Boolean);
let siteLang;
if (host.startsWith('chinese')) {
siteLang = 'chinese';
} else if (pathParts.length === 3) {
// Webhooks will only be triggered by posts, so the path will
// always have 3 parts for our localized instances:
// (/<lang>/news/<slug>/).
// Or if it's coming from a Dockerized test instance / localhost,
// it will only have 1 part: (/<slug>/).
siteLang = pathParts[0];
}
const computedPath = siteLang ? `/${siteLang}/news/` : '/news/';

if (host.startsWith('localhost')) {
return `http://${host}${computedPath}`;
} else {
return `https://www.freecodecamp.org${computedPath}`;
}
};

export const formatGhostPost = (post) => {
const {
id,
@@ -107,22 +136,7 @@ export const formatGhostPost = (post) => {
feature_image,
published_at
} = post;
const URLObj = new URL(url);
const { href, origin, pathname } = URLObj;
const pathParts = pathname.split('/').filter(Boolean);
let siteLang;
if (href.startsWith('https://chinese.freecodecamp.org/')) {
siteLang = 'chinese';
} else if (pathParts.length === 3) {
// Webhooks will only be triggered by posts, so the path will
// always have 3 parts for our localized instances:
// (/<lang>/news/<slug>/).
// Or if it's coming from a Dockerized test instance, it will
// only have 1 part: (/<slug>/).
siteLang = pathParts[0];
}
const siteURL = `${origin}/${siteLang ? `${siteLang}/` : ''}news/`;
console.log({ siteURL, siteLang, href, origin, pathname, pathParts });
const siteURL = getBaseSiteURL(url);

return {
objectID: id,
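To make the effect of the new `getBaseSiteURL` helper concrete, here is a minimal sketch that restates the function from the hunk above (without `export`, and with the final `if`/`else` condensed into a ternary) and runs it on three post URLs. The slugs (`example-post`) are made up for illustration; only the hosts and path shapes come from the diff.

```javascript
// Restatement of getBaseSiteURL as added in this PR, for illustration only.
const getBaseSiteURL = (url) => {
  const { host, pathname } = new URL(url);
  const pathParts = pathname.split('/').filter(Boolean);
  let siteLang;
  if (host.startsWith('chinese')) {
    // chinese.freecodecamp.org is mapped onto the /chinese/ path
    siteLang = 'chinese';
  } else if (pathParts.length === 3) {
    // /<lang>/news/<slug>/ on the localized instances
    siteLang = pathParts[0];
  }
  const computedPath = siteLang ? `/${siteLang}/news/` : '/news/';

  return host.startsWith('localhost')
    ? `http://${host}${computedPath}`
    : `https://www.freecodecamp.org${computedPath}`;
};

// Hypothetical post URLs (the slugs are assumptions)
const samples = [
  'https://chinese.freecodecamp.org/news/example-post/',
  'https://www.freecodecamp.org/espanol/news/example-post/',
  'http://localhost:3030/example-post/'
];

for (const url of samples) {
  console.log(`${url} -> ${getBaseSiteURL(url)}`);
}
// https://chinese.freecodecamp.org/news/example-post/ -> https://www.freecodecamp.org/chinese/news/
// https://www.freecodecamp.org/espanol/news/example-post/ -> https://www.freecodecamp.org/espanol/news/
// http://localhost:3030/example-post/ -> http://localhost:3030/news/
```

The first two results match keys in `searchIndexNameMap` exactly. The localhost result ends in `/news/`, while the map key shown in the diff is `http://localhost:3030`. Whether that still resolves depends on how `getSearchIndexName` compares URLs, which this diff does not show.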
19 changes: 13 additions & 6 deletions packages/ghost/delete-record/index.js
@@ -1,6 +1,10 @@
// The ../../../lib/utils directory is zipped in the same directory
// as the function during the build process
import { algoliaClient, getSearchIndexName } from './utils/helpers.js';
import {
algoliaClient,
getBaseSiteURL,
getSearchIndexName
} from './utils/helpers.js';

export const deleteRecord = async (req) => {
try {
@@ -15,13 +19,16 @@ export const deleteRecord = async (req) => {
// Whether a published post is unpublished or deleted, the
// status will be 'published' in the previous state
if (prevState.status === 'published') {
// Deleted posts don't include a url. But since every
// post must include at least one author, set the index
// based on the primary author page url instead
const primaryAuthorUrl = prevState.authors
// Deleted posts don't include a url or a primary author object.
// But since every post returns an author array, we can use
// the first author object to determine the search index name.
// This helps since we're handling data from localhost,
// chinese.freecodecamp.org, and www.freecodecamp.org.
const primaryAuthorURL = prevState.authors
? prevState.authors[0].url
: currState.authors[0].url;
const indexName = getSearchIndexName(primaryAuthorUrl);
const siteURL = getBaseSiteURL(primaryAuthorURL);
const indexName = getSearchIndexName(siteURL);

if (!indexName) {
throw new Error('No matching index found for the current post');
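The author-URL fallback described in the comment above can be sketched in isolation. The helper name `pickPrimaryAuthorURL` and the mock `prevState` / `currState` objects below are assumptions made for illustration. The real webhook payloads are not shown in this diff and may carry many more fields.

```javascript
// Hypothetical helper mirroring the ternary used in deleteRecord:
// prefer the previous state's authors, else fall back to the current state's.
const pickPrimaryAuthorURL = (prevState, currState) =>
  prevState.authors ? prevState.authors[0].url : currState.authors[0].url;

// Mock "unpublished" event: in this made-up example the previous state only
// carries the status, and the authors live on the current state.
const unpublished = {
  prevState: { status: 'published' },
  currState: {
    status: 'draft',
    authors: [{ url: 'https://chinese.freecodecamp.org/news/author/example/' }]
  }
};

// Mock "deleted" event: in this made-up example the current state is empty,
// and the authors live on the previous state.
const deleted = {
  prevState: {
    status: 'published',
    authors: [{ url: 'http://localhost:3030/author/example/' }]
  },
  currState: {}
};

for (const { prevState, currState } of [unpublished, deleted]) {
  if (prevState.status === 'published') {
    console.log(pickPrimaryAuthorURL(prevState, currState));
  }
}
// https://chinese.freecodecamp.org/news/author/example/
// http://localhost:3030/author/example/
```

In the PR, the URL chosen this way is passed through `getBaseSiteURL` before `getSearchIndexName` is called. That lets an author URL on `chinese.freecodecamp.org` resolve to the `https://www.freecodecamp.org/chinese/news/` key and therefore to `news-zh`.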