Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable inlining of Scala library methods. #9548

Closed
wants to merge 1 commit into from

Conversation

olafurpg
Copy link

@olafurpg olafurpg commented Nov 3, 2020

Previously, the Scala compiler was configured to inline all usages of
symbols in under the scala package. This is problematic because it
means that published Kafka jars must run with the exact same version of
the Scala library that the Kafka jar was compiled with. If the runtime
uses a different version of the Scala library then users risk getting a
crash like this:

java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7

This commit disables inlining from the scala package to prevent
crashes like this. The downside to this change is it may introduce
performance regressions.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Previously, the Scala compiler was configured to inline all usages of
symbols in under the `scala` package. This is problematic because it
means that published Kafka jars must run with the exact same version of
the Scala library that the Kafka jar was compiled with.  If the runtime
uses a different version of the Scala library then users risk getting a
crash like this:

```
java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7
```

This commit disables inlining from the `scala` package to prevent
crashes like this. The downside to this change is it may introduce
performance regressions.
@olafurpg olafurpg mentioned this pull request Nov 3, 2020
3 tasks
@ijuma
Copy link
Contributor

ijuma commented Nov 3, 2020

@olafurpg Thanks for the PR. Can you please elaborate why there are conflicting versions of the Scala library when the core jar is used? The core jar doesn't export a public API and the assumption was that the same Scala library would be used at runtime.

@olafurpg
Copy link
Author

olafurpg commented Nov 3, 2020

Even if users can't directly reference APIs of the core Kafka jar, the bytecode of the core jar is still evaluated at runtime. With the inliner enabled, the core Kafka jar directly references private anonymous classes from scala-library:2.12.11 that are no longer present in scala-library:2.12.12. For example, assuming you have cs (https://get-coursier.io/) installed, observe that scala/math/Ordering$$anon$7 is referenced here below

❯ javap -cp $(cs fetch org.apache.kafka:kafka_2.12:2.5.1 -p) -v kafka.api.ApiVersion$ | grep Ordering\$\$anon
   #24 = Utf8               scala/math/Ordering$$anon$7
   #25 = Class              #24           // scala/math/Ordering$$anon$7
   #84 = Methodref          #25.#83       // scala/math/Ordering$$anon$7."<init>":(Lscala/math/Ordering;Lscala/Function1;)V
        21: new           #25                 // class scala/math/Ordering$$anon$7
        27: invokespecial #84                 // Method scala/math/Ordering$$anon$7."<init>":(Lscala/math/Ordering;Lscala/Function1;)V
     public final #25; //class scala/math/Ordering$$anon$7

The scala-library:2.12.11 jar defines this anonymous class but it's missing from scala-library:2.12.12

❯ jar tf $(cs fetch org.scala-lang:scala-library:2.12.11 -p) | grep Ordering\$\$anon
scala/math/Ordering$$anon$1.class
scala/math/Ordering$$anon$10.class
scala/math/Ordering$$anon$11.class
scala/math/Ordering$$anon$12.class
scala/math/Ordering$$anon$13.class
scala/math/Ordering$$anon$14.class
scala/math/Ordering$$anon$15.class
scala/math/Ordering$$anon$16.class
scala/math/Ordering$$anon$17.class
scala/math/Ordering$$anon$18.class
scala/math/Ordering$$anon$19.class
scala/math/Ordering$$anon$2.class
scala/math/Ordering$$anon$6.class
scala/math/Ordering$$anon$7.class
scala/math/Ordering$DoubleOrdering$$anon$9.class
scala/math/Ordering$FloatOrdering$$anon$8.class
scala/math/PartialOrdering$$anon$1.class
❯ jar tf $(cs fetch org.scala-lang:scala-library:2.12.12 -p) | grep Ordering\$\$anon
scala/math/Ordering$$anon$1.class
scala/math/Ordering$$anon$4.class
scala/math/Ordering$$anon$5.class
scala/math/Ordering$$anon$6.class
scala/math/PartialOrdering$$anon$1.class

The related upstream commit appears to be scala/scala@44318c8#diff-3b590b82493a8dbfc177f6113ee07ac4772d666d99fe808c6f6da01227ad1c6d.

My understanding is that the inliner flags are only safe to enable for applications "at the end of the world", not libraries.

@olafurpg
Copy link
Author

olafurpg commented Nov 3, 2020

To put it differently, enabling the inliner allows Kafka to access private symbols from scala-library.jar. The upstream commit scala/scala@44318c8#diff-3b590b82493a8dbfc177f6113ee07ac4772d666d99fe808c6f6da01227ad1c6d is not a binary breaking change since it only modifies private APIs.

@ijuma
Copy link
Contributor

ijuma commented Nov 3, 2020

I understand that, the point is that the core jar is not meant to be a library. It has no public APIs. The clients jar is meant to be a library.

@ijuma
Copy link
Contributor

ijuma commented Nov 3, 2020

To elaborate a bit more, there are two cases where two cases where this can cause problems that I can think of:

  1. If core is used as a library and there is a mismatch between the Scala version used for compilation and runtime.
  2. The broker is deployed in the same process as another application and there is a mismatch between the Scala version used for compilation and runtime.

Neither of those are recommended.

I'm not opposed to changing these settings for now, but I'm also interested in understanding the specifics of why users are doing 1, 2 or something else that is deviating from our expectations.

@mingaliu
Copy link
Contributor

mingaliu commented Nov 3, 2020

@ijuma , here is the callstack, and you can see why Kafka core is pulled in from the client side (due to using KafakEmbedded in testing):

java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7
at kafka.api.ApiVersion$.orderingByVersion(ApiVersion.scala:45)
at kafka.api.ApiVersion.compare(ApiVersion.scala:139)
at kafka.api.ApiVersion.compare$(ApiVersion.scala:138)
at kafka.api.KAFKA_2_5_IV0$.compare(ApiVersion.scala:339)
at kafka.api.KAFKA_2_5_IV0$.compare(ApiVersion.scala:339)
at scala.math.Ordered.$greater$eq(Ordered.scala:91)
at scala.math.Ordered.$greater$eq$(Ordered.scala:91)
at kafka.api.KAFKA_2_5_IV0$.$greater$eq(ApiVersion.scala:339)
at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1529)
at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1238)
at org.apache.kafka.streams.integration.utils.KafkaEmbedded.<init>(KafkaEmbedded.java:75)
at org.apache.kafka.streams.integration.utils.EmbeddedKafkaCluster.start(EmbeddedKafkaCluster.java:103)
at com.twitter.finatra.kafka.test.EmbeddedKafka.beforeAll(EmbeddedKafka.scala:61)
at com.twitter.finatra.kafka.test.EmbeddedKafka.beforeAll$(EmbeddedKafka.scala:59)
at com.twitter.finatra.kafkastreams.test.AbstractKafkaStreamsFeatureTest.beforeAll(KafkaStreamsFeatureTest.scala:26)
at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
at com.twitter.inject.Test.run(Test.scala:28)
at org.scalatestplus.junit.JUnitRunner.run(JUnitRunner.scala:99)

@olafurpg
Copy link
Author

olafurpg commented Nov 3, 2020

Another related case (akka/alpakka-kafka#1212) is also from using EmbeddedKafka as a library for testing purposes https://github.com/embeddedkafka/embedded-kafka

@ijuma
Copy link
Contributor

ijuma commented Nov 3, 2020

I see, so it's for tests. Note that KafkaEmbedded is not a public API and no backwards compatibility guarantees are offered. The public APIs are all here:

https://kafka.apache.org/26/javadoc/overview-summary.html

Apache Kafka doesn't offer a test api though (I had started https://cwiki.apache.org/confluence/display/KAFKA/KIP-139%3A+Kafka+TestKit+library, but didn't complete it) so I understand why may go down that path.

@olafurpg
Copy link
Author

@ijuma Thank you for the clarification! I wasn't aware that KafkaEmbedded is not a public API with no compatibility guarantees. I'm gonna close this PR since the current behavior appears to be working as intended.

@olafurpg olafurpg closed this Nov 11, 2020
@ijuma
Copy link
Contributor

ijuma commented Nov 11, 2020

@olafurpg I think you should keep this open. I think we may want to apply a change along the lines you submitted here given the example you gave. Even though it's relying on internal APIs, there isn't a good supported alternative. Sorry for the delay, I'll take a closer look in the next few days.

@olafurpg
Copy link
Author

Feel free to commandeer this PR if you still think it’s a good idea. I totally understand the reasoning to keep things unchanged so I’m not sure I’m the best person to push this through. I apologize the noise and thank you for all of your help! 🙏

@mingaliu
Copy link
Contributor

mingaliu commented Dec 2, 2020

Any update? Even KafkaEmbedded is internal but we actually used it in many places for our unitest. So we would love to find a good alternative solutions. (Our current workaround is ugly, as we have to rebuild Kafka core with different scala version).

@ijuma
Copy link
Contributor

ijuma commented Dec 9, 2020

I think I'll have time to look into this later this week.

ijuma added a commit to ijuma/kafka that referenced this pull request Feb 22, 2021
As mentioned in apache#9548, users currently use the kafka jar (`core` module)
for integration testing and the current inlining behavior causes
problems when the user's classpath contains a different Scala version
than the one that was used for compilation.

An example error:

`java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7`

We disable inlining of the `scala` package by default, but make it easy
to enable it for those who so desire (a good option if you can ensure
the scala library version matches the one used for compilation). While
at it, we make it possible to disable scala compiler optimizations
(`none`) or to use only method local optimizations (`method`). This can
be useful if optimizing for compilation time during development.

Verified behavior by running gradlew with `--debug` and checking the
output after `[zinc] The Scala compiler is invoked with:`
@ijuma
Copy link
Contributor

ijuma commented Feb 22, 2021

@mingaliu @olafurpg I submitted #10174. Please take a look.

@ijuma
Copy link
Contributor

ijuma commented Feb 22, 2021

Out of curiosity, are you still using Apache Kafka 2.5.x or have you upgraded to something more recent since?

ijuma added a commit that referenced this pull request Feb 23, 2021
…10174)

As mentioned in #9548, users currently use the kafka jar (`core` module)
for integration testing and the current inlining behavior causes
problems when the user's classpath contains a different Scala version
than the one that was used for compilation (e.g. 2.13.4 versus 2.13.3).

An example error:

`java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7`

We now disable inlining of the `scala` package by default, but make it
easy to enable it for those who so desire (a good option if you can
ensure the scala library version matches the one used for compilation).
While at it, we make it possible to disable scala compiler optimizations
(`none`) or to use only method local optimizations (`method`). This can
be useful if optimizing for compilation time during development.

Verified behavior by running gradlew with `--debug` and checking the
output after `[zinc] The Scala compiler is invoked with:`

Reviewers: Chia-Ping Tsai <[email protected]>
ijuma added a commit that referenced this pull request Feb 23, 2021
…10174)

As mentioned in #9548, users currently use the kafka jar (`core` module)
for integration testing and the current inlining behavior causes
problems when the user's classpath contains a different Scala version
than the one that was used for compilation (e.g. 2.13.4 versus 2.13.3).

An example error:

`java.lang.NoClassDefFoundError: scala/math/Ordering$$anon$7`

We now disable inlining of the `scala` package by default, but make it
easy to enable it for those who so desire (a good option if you can
ensure the scala library version matches the one used for compilation).
While at it, we make it possible to disable scala compiler optimizations
(`none`) or to use only method local optimizations (`method`). This can
be useful if optimizing for compilation time during development.

Verified behavior by running gradlew with `--debug` and checking the
output after `[zinc] The Scala compiler is invoked with:`

Reviewers: Chia-Ping Tsai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants