Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A collection of comments on Version 1.0.0 with Errata 1 #331

Open
45 tasks
ajul opened this issue Aug 4, 2022 · 1 comment
Open
45 tasks

A collection of comments on Version 1.0.0 with Errata 1 #331

ajul opened this issue Aug 4, 2022 · 1 comment

Comments

@ajul
Copy link

ajul commented Aug 4, 2022

Thank you for the spec. For the most part I found it to progress logically, though I did have some comments along the way:

  • General: Consider using a single phrasing out of "defined as", "defined as follows", "specified as", "specified as follows".
  • General: Consider clarifiying the difference between "must", "shall", and "it is a requirement of bitstream conformance that".
  • 3. Symbols and abbreviated terms: IS_INTER_CONTEXTS and SKIP_CONTEXTS are repeated in the table.
  • 3. Symbols and abbreviated terms: Should there be a space between "mv" and the source of the mv, e.g. NEWMV or NEW_MV? Constant names and variables are not currently consistent on this, e.g. COMP_NEWMV_CTXS vs. NEW_MV_CONTEXTS. This appears in other parts of the spec as well.
  • 3. Symbols and abbreviated terms: The motion modes (SIMPLE, OBMC, LOCALWARP) are also defined in 6.10.26. Read motion mode semantics. IMO only one set is necessary.
  • 3. Symbols and abbreviated terms: Consider defining MAX_PLANES = 3.
  • 3. Symbols and abbreviated terms: MB_MODE_COUNT appears to be unused.
  • 3. Symbols and abbreviated terms: TX_CLASS_HORIZ would be more consistent with other naming as TX_CLASS_HORZ.
  • 5.3.2. OBU header syntax: In the case that obu_extension_flag == 0, temporal_id and spatial_id could explicitly be set to 0 here.
  • 5.8.2. Metadata ITUT T35 syntax: ITUT or ITU_T?
  • 5.9.11. Loop filter params syntax: Consider resetting loop_filter_ref_deltas in index order.
  • 5.9.22. Skip mode params syntax: According to the order these are called, this should be after 5.9.23. Frame reference mode syntax.
  • 5.11.4. Decode partition syntax: The value of return 0 is unused.
  • 5.11.7. Intra frame mode info syntax: Apart from YMode, the last case is the same as intra_block_mode_info(). Consider calling intra_block_mode_info()` in this case.
  • 5.11.15. TX size syntax: The last for loop is missing the opening bracket.
  • 5.11.34. Residual syntax: The variables sbMask, subBlockMiRow, and subBlockMiCol seem to be unused.
  • 5.11.48. Get transform set function: TX_SET_INTER_3 uses a single clause with an ||, but TX_SET_INTRA_2 uses two separate clauses.
  • 5.11.49. Palette tokens syntax: Consider updating blockHeight etc. before reading color_index_map_uv. This will allow implementers to initialize ColorMapUV with the right size.
  • 5.11.53. Clamp MV row function: Consider putting this in 7.10.2. Find MV stack process instead.
  • 5.11.54. Clamp MV col function: Same.
  • 5.11.58. Read loop restoration unit syntax: WIENER_COEFFS could be used instead of 3 in the loop bound of j and the size of Wiener_Taps_Mid, Wiener_Taps_Min, Wiener_Taps_Max, Wiener_Taps_K.
  • 6.2.4. Trailing bits semantics: "When the syntax element trailing_one_bit is read, it is a requirement that nbBits is greater than zero." -> " it is a requirement of bitstream conformance that"
  • 6.4.1. General sequence header OBU semantics: "If operating_point_idc[ op ] is not equal to 0 for any value of op from 0 to operating_points_cnt_minus_1, it is a requirement of bitstream conformance that obu_extension_flag is equal to 1.": This appears to be permitted by the reference implementation?
  • 6.4.2. Color config semantics: "If matrix_coefficients is equal to MC_IDENTITY, it is a requirement of bitstream conformance that subsampling_x is equal to 0 and subsampling_y is equal to 0." This seems like it should not apply to the monochrome case?
  • 6.4.4. Decoder model info semantics: "num_units_in_decoding_tick shall be greater than 0": Could this use the standard phrasing "It is a requirement of bitstream conformance that num_units_in_decoding_tick is greater than 0?"
  • 6.8.2. Uncompressed header semantics: "and that RefValid[frame_to_show_map_idx ] is equal to 1": It seems clearer to list this requirement under frame_to_show_map_idx. Otherwise there is an implication that this requirement only applies when display_frame_id is read.
  • 6.8.2. Uncompressed header semantics: The order in which init_non_coeff_cdfs presents the CDFs is not the same as 9.4. Default CDF tables, it would be nice if they were the same.
  • 6.8.2. Uncompressed header semantics: ref_frame_idx has 7 elements and is indexed by ref - LAST_FRAME; gm_params and MotionFieldMvs have 8 elements and are indexed by ref directly, leaving index 0 unused. Why the difference?
  • 6.8.2. Uncompressed header semantics: "It is a requirement of bitstream conformance that when show_existing_frame is used to show a previous frame, that the value of showable_frame for the previous frame was equal to 1." is a bit confusing; this should spell out that "previous frame" refers to frame_to_show_map_idx, like the following paragraph.
  • 6.8.10. Loop filter semantics: "If this syntax element is not present in the, it maintains its previous value." has an extra "in the".
  • 6.8.10. Loop filter semantics: "intially set by the setup_past_independence" -> "initially"
  • 6.10.1. General tile group OBU semantics: "It is a requirement of bitstream conformance that the value of tg_end for the last tile group in each frame is equal to NumTiles - 1." What if there are no tile groups at all? What if the stream ends in the middle of a frame? This check could be based on TileNum rather than tg_end and occur at the next temporal delimiter or frame header. Or say that a temporal unit can't end in the middle of a frame.
  • 6.10.12. Quantizer index delta semantics: DELTA_Q_SMALL would make more sense to be named DELTA_Q_LARGE, since it's the large values that use the alternative encoding.
  • 6.10.13. Loop filter delta semantics: Same for DELTA_LF_SMALL.
  • 6.10.36. Read CFL alphas semantics: Consider using the same or similar sign enum for DC sign as well.
  • 6.10.37. Palette mode info semantics: "Note: Luma and U delta values give a positive offset relative to the previous palette entry in the same plane. V delta values give a signed offset relative to the U palette entries": U delta is only guaranteed to be non-negative. "The palette colors are generated in ascending order" does not apply to V.
  • 7.7. Frame end update CDF process: Consider a different prefix than "Saved" for these CDF tables. There is only one set of these CDFs, and they are only relevant until the end of the frame, unlike other "saved" variables which have 8 slots and persist across multiple frames.
  • 7.9.1. General: Consider making refStamp one greater and check whether it is > 0. This would represent the remaining space left in the stack.
  • 7.9.4. Get block position process: Consider "x8 and y8" -> "y8 and x8".
  • 7.10.2.9. Compound search stack process: candSize is unused. Also, consider moving has_newmv( candMode ) earlier to match the non-compound version.
  • 8.3.2. Cdf selection process: Consider listing all non-coeff elements before all coeff elements (or vice-versa) rather than putting the coeff elements in the middle.
  • 8.3.2. Cdf selection process: The tables Sig_Ref_Diff_Offset and Palette_Color_Context appear to only be used in CDF selection; consider moving them to this section.
  • 8.3.2. Cdf selection process: The CDF selection process for coeff_br could use TX_CLASS_2D instead of 0, etc.
  • 8.3.2. Cdf selection process: The CDF selection process for interp_filter could use BILINEAR instead of 3.
  • 9.5.2. Derivation process (Informative): This should note that the width and height are treated as no greater than 32, including for determining which fundamentalMatrix to use.

I hope this is helpful.

@peterderivaz
Copy link
Collaborator

These all look like excellent improvements, thanks so much for spending the time to find them!

Based on previous issues, I suspect AOMedia will be reluctant to make any purely cosmetic changes to the AV1 spec (even when they are clear improvements), but very interested in making the corresponding improvements to the AV2 specification.

I don't think the AV2 specification has been written yet, but quite a few of the comments (e.g. about consistent naming) are inherited from the source code so you could consider submitting a pull request to the libaom source code first. (For example, it would seem a bit odd for the specification to call things DELTA_Q_LARGE if the source code still called it DELTA_Q_SMALL, or to use a variable like refStamp with the same name, but with values off by 1.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants