Skip to content

Conversation

@eschabell
Copy link
Collaborator

@eschabell eschabell commented Dec 12, 2025

  • Change Generic.Encoding to generic.encoding to match source
  • Change Unicode.Encoding to unicode.encoding to match source
  • Fix East asian to East Asian capitalization

Fixes #2202.

Summary by CodeRabbit

  • Documentation
    • Renamed encoding parameters for consistency (updated examples and config samples)
    • Normalized headings and casing (e.g., East Asian encodings)
    • Updated guidance and examples for encoding options and aliases across regions
    • Minor wording and formatting improvements; documentation-only changes, no behavioral impact

✏️ Tip: You can customize this high-level summary in your review settings.

- Change Generic.Encoding to generic.encoding to match source
- Change Unicode.Encoding to unicode.encoding to match source
- Fix East asian to East Asian capitalization

Fixes fluent#2202.

Signed-off-by: Eric D. Schabell <[email protected]>
@eschabell eschabell self-assigned this Dec 12, 2025
@eschabell eschabell requested review from a team as code owners December 12, 2025 14:45
@eschabell eschabell added the waiting-on-review Waiting on a review from mainteners label Dec 12, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 12, 2025

Walkthrough

Documentation-only updates to the Tail input guide: parameter name normalization (Generic.Encoding → generic.encoding, Unicode.Encoding → unicode.encoding), headline casing fixes, and expanded/clarified encoding lists and examples in pipeline/inputs/tail.md.

Changes

Cohort / File(s) Summary
Tail input documentation
pipeline/inputs/tail.md
Renamed parameters from Generic.Encodinggeneric.encoding and Unicode.Encodingunicode.encoding; normalized headline casing (e.g., "East asian" → "East Asian"); updated examples, configuration blocks, informational hints, and expanded encoding lists/aliases.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

  • Single-file, documentation-only changes with consistent, repetitive edits.
  • Review focus: confirm parameter name consistency, examples reflect new names, and no leftover capitalization variants.

Possibly related PRs

Suggested reviewers

  • alexakreizinger
  • cosmo0920

Poem

🐇 I hopped through lines of docs today,
Swapped names and fixed the caps in play,
generic.encoding now gleams, unicode too,
East Asian lists updated — tidy and true.
A happy tail of docs, succinct and new.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR addresses only parameter casing and style fixes but does not implement the primary objective of issue #2202: documenting the listed undocumented Tail input settings. Add documentation for the eight undocumented settings identified in issue #2202 (read_newly_discovered_files_from_head, watcher_interval, progress_check_interval, progress_check_interval_nsec, fstat_interval, event_batch_size, truncate_long_lines, docker_mode_parser).
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: fixing parameter case (Generic.Encoding/Unicode.Encoding) and fixing style (East asian capitalization).
Out of Scope Changes check ✅ Passed All changes are within scope: parameter casing corrections and style fixes align with the PR title and the referenced issue context of improving Tail input documentation.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
pipeline/inputs/tail.md (3)

498-512: Inconsistent parameter naming in configuration parameter descriptions.

The parameter names in this section use capitalized forms (Unicode.Encoding and Generic.Encoding), but the configuration table (lines 27 and 50) documents them as lowercase (generic.encoding and unicode.encoding). Update these references for consistency:

-1. `Unicode.Encoding`
+1. `unicode.encoding`

   Use this parameter for high-performance conversion of UTF-16 encoded logs to UTF-8. This method utilizes modern processor features (SIMD instructions) to accelerate the conversion process, making it highly efficient.

   - Use Case: Ideal for logs coming from modern Windows environments that default to UTF-16.
   - Supported Values:
     - `UTF-16LE` (Little-Endian)
     - `UTF-16BE` (Big-Endian)

-1. `Generic.Encoding`
+1. `generic.encoding`

   Use this parameter to convert from a wide variety of other character encodings, particularly legacy Windows code pages.

   - Use Case: Essential for logs from older systems or applications configured for specific regions, common in East Asia and Eastern Europe.

92-96: Inconsistent parameter naming: change Unicode.Encoding to unicode.encoding to match the parameter table.

Lines 93 and 95 reference the parameter as Unicode.Encoding (capitalized), but the configuration table (line 50) documents it as unicode.encoding (lowercase). Update these references to match the canonical lowercase form used in the parameter table.

-The `Unicode.Encoding` parameter is dependent on the `simdutf` library, which is itself dependent on C++ version 11 or later. In environments that use earlier versions of C++, the `Unicode.Encoding` parameter will fail.
+The `unicode.encoding` parameter is dependent on the `simdutf` library, which is itself dependent on C++ version 11 or later. In environments that use earlier versions of C++, the `unicode.encoding` parameter will fail.

-Additionally, the `auto` setting for `Unicode.Encoding` isn't supported in all cases, and can make mistakes when it tries to guess the correct encoding. For best results, use either the `UTF-16LE` or `UTF-16BE` setting if you know the encoding type of the target file.
+Additionally, the `auto` setting for `unicode.encoding` isn't supported in all cases, and can make mistakes when it tries to guess the correct encoding. For best results, use either the `UTF-16LE` or `UTF-16BE` setting if you know the encoding type of the target file.

Note: Similar inconsistencies exist elsewhere in the file (lines 498, 507, 539, 559) with capitalized Generic.Encoding and Unicode.Encoding that should also be updated to match the lowercase parameter table entries.


539-560: Fix parameter casing in .conf example to match documented parameter name.

Line 559 shows Generic.Encoding in the .conf example, but the documented parameter name (line 526, parameter table) is generic.encoding (lowercase). Since Fluent Bit parameters are case-sensitive, update the .conf example to use the correct casing:

[INPUT]
    Name                tail
    Path                C:\path\to\your\sjis.log
-   Generic.Encoding    ShiftJIS
+   generic.encoding    ShiftJIS
🧹 Nitpick comments (1)
pipeline/inputs/tail.md (1)

11-51: Missing parameter documentation: fstat_interval from issue #2202.

Issue #2202 identifies eight undocumented settings. This PR addresses seven of them (lines 23, 37–40, 49–51), but fstat_interval is still missing from the configuration table. Please add documentation for this parameter or confirm it is intentionally omitted.

Do you want me to help add the missing fstat_interval parameter documentation once you confirm the parameter details (description and default value)?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92fb0e1 and c5248e4.

📒 Files selected for processing (1)
  • pipeline/inputs/tail.md (3 hunks)
🧰 Additional context used
🪛 GitHub Check: runner / vale
pipeline/inputs/tail.md

[warning] 514-514:
[vale] reported by reviewdog 🐶
[FluentBit.Headings] 'East Asian encodings' should use sentence-style capitalization.

Raw Output:
{"message": "[FluentBit.Headings] 'East Asian encodings' should use sentence-style capitalization.", "location": {"path": "pipeline/inputs/tail.md", "range": {"start": {"line": 514, "column": 5}}}, "severity": "INFO"}

🔇 Additional comments (1)
pipeline/inputs/tail.md (1)

514-514: Approved: Proper heading capitalization despite Vale warning.

Line 514 correctly uses title-case capitalization for "East Asian encodings," which is semantically appropriate for a region and language group reference. The Vale linter suggests sentence-style capitalization ("East asian"), but the PR's approach is the correct standard for documentation headings. No change needed.

- Change Generic.Encoding to generic.encoding in text and examples
- Change Unicode.Encoding to unicode.encoding in text and examples

Applies to fluent#2202.

Signed-off-by: Eric D. Schabell <[email protected]>
@eschabell
Copy link
Collaborator Author

@alexakreizinger please review!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pipeline/inputs/tail.md (1)

539-560: Fix inconsistent casing in fluent-bit.conf example (still uses Generic.Encoding). This contradicts the doc/table and the PR objective. Suggested patch:

 [INPUT]
     Name                tail
     Path                C:\path\to\your\sjis.log
-    Generic.Encoding    ShiftJIS
+    generic.encoding    ShiftJIS
🧹 Nitpick comments (1)
pipeline/inputs/tail.md (1)

498-513: Encoding parameter section updates are consistent with the renames. Minor: both list items are numbered 1.—Markdown will render fine, but consider 1. / 2. if your style guide prefers explicit numbering.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5248e4 and 062dcd8.

📒 Files selected for processing (1)
  • pipeline/inputs/tail.md (6 hunks)
🧰 Additional context used
🪛 GitHub Check: runner / vale
pipeline/inputs/tail.md

[warning] 514-514:
[vale] reported by reviewdog 🐶
[FluentBit.Headings] 'East Asian encodings' should use sentence-style capitalization.

Raw Output:
{"message": "[FluentBit.Headings] 'East Asian encodings' should use sentence-style capitalization.", "location": {"path": "pipeline/inputs/tail.md", "range": {"start": {"line": 514, "column": 5}}}, "severity": "INFO"}

🔇 Additional comments (4)
pipeline/inputs/tail.md (4)

27-27: Keep generic.encoding naming consistent across the doc (table looks good). The rename aligns with the PR objective; supported-values list also matches the later “legacy code pages” section.


50-50: unicode.encoding rename looks consistent and clear.


514-514: Resolve heading capitalization: PR intent vs Vale rule conflict. PR changes to “East Asian” (arguably correct), but Vale requests sentence-style capitalization. Either (a) keep “East Asian” and adjust/override the Vale rule, or (b) change back to satisfy Vale (but that partially negates the PR objective).


92-96: Verify and clarify the build-time nature of the simdutf dependency. The claim about simdutf requiring C++11 is accurate, but the documentation should note that Unicode.Encoding availability is build-time dependent (controlled by FLB_USE_SIMDUTF/FLB_UNICODE_ENCODER CMake flags). Users whose Fluent Bit binary lacks C++11 support compiled out likely won't have the Unicode.Encoding feature available at all, rather than failing at runtime. Consider linking to the CMake build documentation or specifying the minimum version (v4.1+) where this feature is present.

Copy link
Member

@alexakreizinger alexakreizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

waiting-on-review Waiting on a review from mainteners

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Undocumented secret settings for Input Tail

2 participants