unicode: defs: conv: Implement conversion rules of character encodings #10464
Open
cosmo0920 wants to merge 5 commits into master from cosmo0920-implement-conversion-rules-of-character-encodings
The failure seems to be a weird flake in the
As far as I can tell, it is unrelated to the code in this PR.
Yes, that test case is flaky on macOS. Thanks for pointing it out.
Signed-off-by: Hiroshi Hatake <[email protected]>
Note: This converter implementation in Fluent Bit mainly includes the rules related to CJK encodings. Signed-off-by: Hiroshi Hatake <[email protected]>
…e characters Signed-off-by: Hiroshi Hatake <[email protected]>
I implemented an encoding conversion engine for Fluent Bit that supports the following encodings:

East Asian Encodings
- ShiftJIS: 932
- GB18030: 54936
- GBK: 936
- UHC (Unified Hangul Code): 949
- Big5: 950

Windows (ANSI) Encodings
- Win1250 (Central European): 1250
- Win1251 (Cyrillic): 1251
- Win1252 (Western European / Latin): 1252
- Win1253 (Greek): 1253
- Win1254 (Turkish): 1254
- Win1255 (Hebrew): 1255
- Win1256 (Arabic): 1256

DOS (OEM) Encodings
- Win866 (Cyrillic - DOS): 866
- Win874 (Thai): 874

Encoding conversion matters especially in CJK language environments, because the byte sequences produced by these frequently used encodings are not always compatible with the way UTF-8 represents characters.
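To make the incompatibility concrete, here is a minimal Python sketch. It is not the Fluent Bit implementation (which this PR writes in C); the names `CODE_PAGE_TO_CODEC`, `is_valid_utf8`, and `to_utf8` are hypothetical, and mapping the code page numbers listed above to Python's built-in codecs is an assumption made only for illustration.

```python
# Illustration only: Python's built-in codecs stand in for the converter.
# The code page numbers come from the list above; pairing them with these
# Python codec names is an assumption of this sketch.
CODE_PAGE_TO_CODEC = {
    932: "cp932",      # ShiftJIS
    54936: "gb18030",  # GB18030
    936: "gbk",        # GBK
    949: "cp949",      # UHC (Unified Hangul Code)
    950: "cp950",      # Big5
    1250: "cp1250",    # Central European
    1251: "cp1251",    # Cyrillic
    1252: "cp1252",    # Western European / Latin
    1253: "cp1253",    # Greek
    1254: "cp1254",    # Turkish
    1255: "cp1255",    # Hebrew
    1256: "cp1256",    # Arabic
    866: "cp866",      # Cyrillic - DOS
    874: "cp874",      # Thai
}


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8(raw: bytes, code_page: int) -> bytes:
    """Decode bytes using the given code page and re-encode them as UTF-8."""
    codec = CODE_PAGE_TO_CODEC[code_page]
    return raw.decode(codec).encode("utf-8")


# A Japanese word as it would appear in a ShiftJIS (code page 932) log line.
sjis_bytes = "日本語".encode("cp932")

# The raw bytes are not valid UTF-8, so a UTF-8-only pipeline cannot read them.
print(is_valid_utf8(sjis_bytes))

# After conversion, the same text is ordinary UTF-8.
utf8_bytes = to_utf8(sjis_bytes, 932)
print(is_valid_utf8(utf8_bytes), utf8_bytes.decode("utf-8"))
```

The sketch keys the table on the code page number so that callers can select an encoding the same way the list above identifies it.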
In addition, this could be a first milestone toward removing the obstacles to migrating from other log collectors that already support CJK-related character encodings.
This capability is very important for Asian languages, especially in CJK environments.
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change, please submit the following in a comment:
Only the logs for the added unit test assertion cases, run under valgrind:
No leaks were detected.
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.