Special characters in Japanese translation via API #58

Open
kolodko-02 opened this issue Jan 15, 2025 · 0 comments

When translating text from English to Japanese using the DeepL API, if the English text contains German words with special characters (e.g., ä in "Auswärtiges"), these characters are converted into HTML entities in the translation output (e.g., ä comes back as an entity instead of the literal character). This happens even though I set tag_handling to xml and specify ignore_tags.

Input Text (English):
You should use this text as a general guide, but it cannot be a legal consultation. The visa regulations are made and executed by the Federal Foreign Office (Auswärtiges Amt), Federal Ministry of the Interior (BMI), and the local Foreigners’ Offices (Ausländerbehörde). Please be aware: Some rules and procedures can vary from embassy/consulate to embassy/consulate (even within one country), and from Ausländerbehörde to Ausländerbehörde in Germany. That is why we highly recommend you also consult the German representation abroad near you, and the local Foreigners’ Office in Germany (which Ausländerbehörde is responsible for you depends on your residence in Germany) for information that applies to your case.

Actual Output:
このテキストは一般的なガイドとして使用できますが、法的な相談にはなりません。ビザの規制は、外務省(Auswärtiges Amt)、内務省(BMI)、および現地の外国人局(Ausländerbehörde)によって作成および実行されます。ご注意ください: 一部の規則および手続きは、大使館/領事館間(同じ国内でも)およびドイツのAusländerbehörde間で異なる場合があります。そのため、最寄りのドイツの国外代表部およびドイツでの現地の外国人局(あなたに責任を持つAusländerbehördeはドイツでの居住地によります)にも相談することを強くお勧めします。

Here is the PHP code I am using to make the API request:
$response = $this->deeplApiServiceTranslator->translateText(
    $text,
    $sourceLanguage, // 'EN'
    $language, // 'JA'
    [
        'tag_handling' => 'xml',
        'ignore_tags' => 'keep',
        'formality' => 'prefer_less',
        'glossary' => $glossaryId,
        'model_type' => 'prefer_quality_optimized',
    ]
);

where $this->deeplApiServiceTranslator is an instance of DeepL\Translator.

Converting the encoding after translation does not help:
$field['field_content'] = mb_convert_encoding($field['field_content'], 'UTF-8', 'auto');
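
For context, a minimal sketch of why re-encoding has no effect and what a post-processing workaround might look like. It assumes the entity arrives in named form (&auml;); the exact form returned by the API is not shown here, and the sample string is hypothetical. An entity is plain ASCII text, so changing the byte encoding leaves it alone, whereas html_entity_decode() maps it back to the character.

```php
<?php
// Hypothetical sample: a translated fragment in which "ä" came back as an entity.
// The named form (&auml;) is an assumption; a numeric form (&#228;) decodes the same way.
$translated = '外務省(Ausw&auml;rtiges Amt)';

// Re-encoding only changes how bytes are interpreted. The entity is made of
// ASCII characters, so the string comes back unchanged.
$reencoded = mb_convert_encoding($translated, 'UTF-8', 'UTF-8');

// html_entity_decode() converts named and numeric entities back to characters.
$decoded = html_entity_decode($translated, ENT_QUOTES | ENT_HTML5, 'UTF-8');

var_dump($reencoded === $translated); // the entity survives re-encoding
echo $decoded, PHP_EOL;
```

This only cleans up the output afterwards; it does not stop the API from producing entities. It also decodes &amp;, &lt; and &gt;, so it may not be safe to apply to text that still has to be parsed as XML.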

How can I prevent the DeepL API from converting special characters in German words to HTML entities during translation?
