feat(route): add route for university: CNU JWC #17709

Open
wants to merge 10 commits into
base: master
Changes from 6 commits
81 changes: 81 additions & 0 deletions lib/routes/cnu/jwc.ts
@@ -0,0 +1,81 @@
import { Route } from '@/types';
import { parseDate } from '@/utils/parse-date';
import got from '@/utils/got';
import { load } from 'cheerio';
import cache from '@/utils/cache';

const BASE_URL = 'https://jwc.cnu.edu.cn/tzgg/index.htm';

export const route: Route = {
    path: '/jwc',
    categories: ['university'],
    example: '/cnu/jwc',
    radar: [
        {
            source: [new URL(BASE_URL).host],
        },
    ],
    name: '首都师范大学教务处',
    maintainers: ['Aicnal'],
    handler,
    url: new URL(BASE_URL).host + new URL(BASE_URL).pathname, // host + pathname
};

async function handler() {
    const response = await got({ method: 'get', url: BASE_URL });
    const $ = load(response.data);

    const list = $('li')
Collaborator commented:

The `li` selector matches far too many unrelated elements.
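
One possible way to address this is to scope the selector to the container that holds the notice list, rather than matching every `li` on the page. The sketch below is illustrative only: the sample markup and the `.news-list` class are assumptions, because the real structure of the jwc.cnu.edu.cn page is not shown in this PR, and the actual container selector would have to be read off the live page.

```typescript
import { load } from 'cheerio';

// Hypothetical markup: a navigation list plus the notice list.
// The class name `.news-list` is an assumption, not taken from the real page.
const sampleHtml = `
<ul class="nav"><li><a href="/about.htm">About</a></li></ul>
<ul class="news-list">
  <li><a href="../info/1.htm">12 2024-11 Notice A</a></li>
  <li><a href="../info/2.htm">03 2024-12 Notice B</a></li>
</ul>`;

const $ = load(sampleHtml);

// Unscoped: also picks up the navigation entry.
const allItems = $('li').toArray();

// Scoped: only the entries inside the (assumed) notice list container.
const scopedItems = $('.news-list li').toArray();
const titles = scopedItems.map((el) => $(el).find('a').text().trim());
```

With a scoped selector, the `if (!match) return null` guard in the PR becomes a safety net rather than the main filter.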

        .toArray() // Convert to an array first
        .map((e) => {
            const element = $(e);
            const rawTitle = element.find('a').text().trim();
            const dateRegex = /^(\d{1,2})\s+(\d{4})-(\d{1,2})/;
            const match = rawTitle.match(dateRegex);

            if (!match) {
                return null;
            }

            const [, day, year, month] = match;
            const pubDate = parseDate(`${year}-${month}-${day}`, 'YYYY-MM-DD');
            const title = rawTitle
                .replace(dateRegex, '')
                .trim()
                .replaceAll(/(公众|教师|学生)/g, '')
                .trim();
            const href = element.find('a').attr('href') ?? '';
            const link = href.startsWith('http') ? href : new URL(href, BASE_URL).href;

            return { title, link, pubDate };
        })
        .filter(Boolean);

    const items = await Promise.all(
        list.map((item) =>
            cache.tryGet(item.link, async () => {
                // Cache the detail page
                const detailResponse = await got({ method: 'get', url: item.link });
                const content = load(detailResponse.data);
                const paragraphs = content(
                    'body p:not(:contains("分享到:")):not(:contains("版权所有")):not(:contains("地址:")):not(:contains("E-mail:")):not(:contains("网站地图")):not(:contains("ICP备")):not(:contains("京公网安备"))'
                )
                    .toArray()
                    .map((el) => content(el).html()?.trim())
                    .join('<br/><br/>');
Collaborator commented:

Suggested change
- 'body p:not(:contains("分享到:")):not(:contains("版权所有")):not(:contains("地址:")):not(:contains("E-mail:")):not(:contains("网站地图")):not(:contains("ICP备")):not(:contains("京公网安备"))'
- )
- .toArray()
- .map((el) => content(el).html()?.trim())
- .join('<br/><br/>');
+ '.article02'
+ ).html();

Simply select the element with the class name `article02`, so the code does not have to loop through every `p` element.
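
As a rough illustration of the suggestion, the sketch below extracts the inner HTML of `.article02` and falls back to the PR's placeholder string when the element is missing. The sample markup and the helper name `extractDescription` are hypothetical; only the `.article02` class and the `'暂无内容'` fallback come from this thread.

```typescript
import { load } from 'cheerio';

// Hypothetical detail page: header and footer paragraphs surround the article body.
const detailHtml = `
<div class="header"><p>分享到:</p></div>
<div class="article02"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="footer"><p>版权所有</p></div>`;

// Hypothetical helper, not part of the PR.
function extractDescription(html: string): string {
    const $ = load(html);
    // .html() returns the inner HTML of the first match, or null if nothing matches.
    const body = $('.article02').html();
    // Fall back to the PR's placeholder when the article body is missing or empty.
    return body?.trim() || '暂无内容';
}

const description = extractDescription(detailHtml);
const missing = extractDescription('<p>no article container here</p>');
```

Because only the article container is read, the long chain of `:not(:contains(...))` filters for header and footer text is no longer needed.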


                return {
                    ...item,
                    description: paragraphs || '暂无内容',
                };
            })
        )
    );

    return {
        title: '首都师范大学教务信息',
        link: BASE_URL,
        description: '首都师范大学教务处的最新通知公告',
        item: items,
    };
}
7 changes: 7 additions & 0 deletions lib/routes/cnu/namespace.ts
@@ -0,0 +1,7 @@
import type { Namespace } from '@/types';

export const namespace: Namespace = {
    name: '首都师范大学',
    url: 'cnu.edu.cn',
    lang: 'zh-CN',
};