Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make interfaces as the official ADO.NET Provider API instead of classes #15288

Closed
nvivo opened this issue Sep 26, 2015 · 174 comments
Closed

Make interfaces as the official ADO.NET Provider API instead of classes #15288

nvivo opened this issue Sep 26, 2015 · 174 comments
Assignees
Milestone

Comments

@nvivo
Copy link

nvivo commented Sep 26, 2015

From what I can see currently on the corefx-progress page for System.Data.Common, the interfaces (IDbCommand, IDbConnection, etc) were removed in favor of the usage of abstract classes.

But in the new API, most of the main methods are not virtual or abstract. On DbCommand alone we can see this:

public DbConnection Connection { get; set; }
public DbParameterCollection Parameters { get; }
public DbTransaction Transaction { get; set; }
public DbParameter CreateParameter();
public Task<int> ExecuteNonQueryAsync();
public DbDataReader ExecuteReader();
public DbDataReader ExecuteReader(CommandBehavior behavior);
public Task<DbDataReader> ExecuteReaderAsync();
public Task<DbDataReader> ExecuteReaderAsync(CommandBehavior behavior);
public Task<DbDataReader> ExecuteReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken);
public Task<DbDataReader> ExecuteReaderAsync(CancellationToken cancellationToken);
public Task<object> ExecuteScalarAsync();

While these methods can certainly be made virtual or abstract, it would be much more useful to have the real interfaces back, and make any public API depend on these interfaces instead of the abstract classes.

This is mostly useful when developing libraries. Today it's very hard to mock a datareader to make it return a specific value for testing purposes. The same for ensuring that ExecuteReaderAsync was called, not ExecuteReader, etc.

I propose the provider factory instead should be made as an interface:

public interface IDbProviderFactory {
    IDbCommand CreateCommand();
    IDbConnection CreateConnection();
    IDbConnectionStringBuilder CreateConnectionStringBuilder();
    IDbParameter CreateParameter();
}

And then follow from there to the rest of the provider to things like IDbDataReader, IDbTransaction, etc.

We know the interfaces became out of sync for some reason in the past and the abstract classes were made the official API, but this doesn't need to be the case anymore in corefx.

Note that this doesn't mean removing the System.Data.Common in any way, but instead make the Common classes implement these interfaces, and you wouldn't use System.Data.Common unless you're implementing the provider. Applications would depend only on the interfaces instead.

Please consider this to make the API more testable on corefx 1.0.

Related to discussions on #14302 and #15269.

@JamesNK
Copy link
Member

JamesNK commented Sep 26, 2015

They would have switched to abstract base classes because interfaces can't be versioned.

I'm guessing the non-virtual methods call another method that is virtual. Virtual methods hurt performance so you don't want to make things virtual unnecessarily.

@nvivo
Copy link
Author

nvivo commented Sep 30, 2015

@JamesNK I see. But with .NET Core being a new API and ADO.NET API being quite stable over almost a decade, do you think this is still a valid concern? Also, talking about database access my guess is that the cost of virtual methods is dwarfed by the cost of the database access.

@NickCraver, @roji, @FransBouma since you guys seem to have interest in the ADO.NET API, have anything to say about this?

@YoungGah, is this something worth pursuing?

@FransBouma
Copy link
Contributor

I'm guessing the non-virtual methods call another method that is virtual. Virtual methods hurt performance so you don't want to make things virtual unnecessarily.

In the process of executing a query on a remote database and processing results, the nanoseconds lost on a virtcall are negligible. Besides, ADO.NET uses this system since the beginning (so do lots of other APIs in .NET) and no-one complained that their DB code is so slow due to virtual method calls ;)

@JamesNK
Copy link
Member

JamesNK commented Sep 30, 2015

I can see async methods in your list so I'm guessing just a couple of years ago MS couldn't of added async to IDbCommand. Who knows what tomorrow will bring that will require new methods or properties.

Interfaces don't version.

Performance is just one reason not to make something virtual. Reducing the surface area for implentations maybe? I'll let someone at MS say why they decided not to, I don't know much about ADO.NET so I'd just be speculating.

@nvivo
Copy link
Author

nvivo commented Sep 30, 2015

@JamesNK I think your concerns are valid, but there are 2 important points to consider:

  1. ADO.NET has been pretty much stable since .NET 2.0, which a decade - although the async API was added later, it didn't change the behavior of the API, just added async counterparts - I don't see any big changes in the database driver paradigm anytime soon
  2. CoreFx is supposed to have a different versioning idea, since you can just keep the previous CLR for old apps. So interface versioning issues shouldn't have such an impact here

Consider also that even a sql server on "localhost" will spend at lease a few ms just to connect and return an empty query. In practice, most fast queries to relational databases take ~20ms.

Being able to mock the API with standard tools like NSubstitute or Moq is much more valuable to the developer today than saving microseconds in virtual method lookups.

@roji
Copy link
Member

roji commented Sep 30, 2015

I guess I don't have a very strong opinion here, but here are some remarks:

  • Agree with the above that removing virtual vs. non-virtual is negligible in an API for database access
  • Base classes do allow ADO.NET to provide implementations, I'm guessing that's what most the non-virtual non-abstract methods are about - the overload of ExecuteReader that doesn't accept a CommandBehavior passes CommandBehavior.Default to the overload that does. If you switch to interfaces, every provider will have to implement ExecuteReader() with the exact same boilerplate...
  • Am not sure this is valid across all major mocking frameworks, but at least in Moq isn't it's just as easy to mock a base class as it is an interface?

So overall the idea of dropping either the base classes or the interfaces seems good (simpler). Since I don't see any advantage to interfaces (unless I'm wrong about the ease of mocking), and base classes can provide common functionality (i.e. the non-virtual non-abstract methods), I guess Microsoft's approach of dumping the interfaces seems good to me...

@FransBouma
Copy link
Contributor

I agree with @roji on all his points.

@nvivo
Copy link
Author

nvivo commented Sep 30, 2015

@roji just a note, I'm not proposing droping the base classes, I'm proposing adding the interfaces as the default API. Base classes can still implement default behavior.

As for testing, I had huge issues testing if my API was calling the right methods. To check if ExecuteDataReader received the right parameters for example, you must check another protected method that is called internally with different parameters. This is far from ideal.

Currently, unless I'm mistaken, the only framework that can mock the ADO.NET API is the MS Fakes framework which can mock absolutely anything by intercepting calls. Moq and others can't do that.

I'm interested to see if other people had similar issues.

@roji
Copy link
Member

roji commented Sep 30, 2015

@roji just a note, I'm not proposing droping the base classes, I'm proposing adding the interfaces as the default API. Base classes can still implement default behavior.

Sorry, I misunderstood that. In that case, isn't your proposition more or less keeping things the way they are in .NET (not that there's anything wrong with that)?

As for testing, I had huge issues testing if my API was calling the right methods. To check if ExecuteDataReader received the right parameters for example, you must check another protected method that is called internally with different parameters. This is far from ideal.

If I understand your scenario (not sure), Moq's CallBase is useful for this kind of scenario - default implementations are inherited from the base class

@nvivo
Copy link
Author

nvivo commented Sep 30, 2015

@roji

isn't your proposition more or less keeping things the way they are in .NET (not that there's anything wrong with that)?

Not exactly. The interface API was added in .NET 1.0 and deprecated on 2.0. Since 2.0, the interfaces are there for compatibility, but there is no interface for ProviderFactory or other classes in Data.Common. There is also nothing for the async API or 2.0 or newer methods.

Moq can only mock things that are mockable. There must be some method that is either virtual or abstract that it can override, or a protected method it can call. The current API provides method for some cases, but not for most of them. There are many things that are internal, private and out of reach unless you use reflection. Only MS Fakes can do it because it replaces the reference with a shim, but this is only available on VS Enterprise and useless for open source projects.

It sounds like I have a very specific case, but certainly anyone who ever tried to mock this api faced this issue. Just google, almost every solution ends up with "mock the legacy interface API or build a wrapper you can mock":

@roji
Copy link
Member

roji commented Sep 30, 2015

@nvivo OK, thanks for the extra details - I admit I haven't gone very far with mocking ADO.NET.

The thing I don't understand, is why you would want to mock internal, private and otherwise out of reach methods of an API. Shouldn't you be mocking public methods that are directly available to your own application code (which is what you're trying to test)? I do see the issue with the non-virtual methods (e.g. the 0-parameter ExecuteReader() overload), but given that in ADO.NET these always (?) call some virtual overload (e.g. ExecuteReader(CommandBehavior)), is there a real issue here?

Just trying to understand your problem scenario, can you give a simple example?

@YoungGah
Copy link

@nvivo We currently have no plan to bring interfaces in because of the versioning issue which was already pointed out by several people on this thread. Good example of interfaces getting behind is when async and streaming methods were added in .NET Framework 4.5. When we added those new features, we carefully looked into extending interfaces. The options we had at that time were either provide InterfaceFooV2 or separate interfaces for asyn and streaming. We didn't want to add InterfaceFooV2 as we can foresee that we would want to add more APIs in the future. Keep adding separate interfaces for each new functionality would be confusing as they are not tied to the existing interfaces.

@nvivo
Copy link
Author

nvivo commented Sep 30, 2015

@roji I had cases where I want to ensure that a specific overload of ExecuteReader was called, and not that "any of the overloads". It's the kind of thing you have only in libraries, not on user code.

@YoungGah thanks for the info. I'm closing this then.

@nvivo nvivo closed this as completed Sep 30, 2015
@mythz
Copy link

mythz commented Nov 24, 2015

Do people responsible for this change have any idea of its effect? The core ADO.NET Interfaces have been around for over a decade and data access being the center of most business apps, I'm having trouble conceiving how not purposely breaking so many existing code bases isn't the highest priority? These are some of the most critical high-level interfaces in .NET, removing these breaks every ADO .NET data access library out there and by consequence every project using it. Removing them creates an artificial fragmentation, causing frustration and confusion that will hamper adoption of CoreCLR.

You can still version interfaces and make them source compatible by just adding extension methods for any New API's off IDbCommand, e.g:

public interface IDbCommand
{
    //...
}

public class DbCommand : IDbCommand
{
    void NewApi();
}

public static class DbCommandExtensions
{
    public static void NewApi(this IDbCommand cmd)
    {
        ((DbCommand)cmd).NewApi();
    }
}

The core IDbCommand interface never has to change after DNX is released and you can continue adding functionality with the strategy above. You could also later (in a major breaking version), roll up these extensions and merge them into the core interface. Either way we get the core stable ADO.NET interfaces that's critical for migration of existing code bases and consequently for adoption of CoreCLR.

@mythz
Copy link

mythz commented Nov 25, 2015

I've been asked by @davkean to provide concrete examples of what impact removing core ADO .NET interfaces will have. I can't imagine this change was considered without evaluating the immeasurable impact it would have on the existing .NET ecosystem, but then again it's also been done so there's a possibility it wasn't considered - which I'm going to assume here-in it wasn't.

Despite EF's role of being .NET's default ORM and its outstanding success in capturing a large majority of market share, there's still a large population of .NET developers who prefer to instead use alternative ORM's for a number of different reasons. E.g. an important feature as it relates to CoreCLR is that they have first-class support running on Mono/Linux/OSX as well as supporting multiple alternative RDBMS's. Since CoreCLR is pitching heavily for the Linux/OSX developer market, the more support there is for alt RDBM's, the better. Another important trait of the population of devs who adopt Micro ORM's is that they've evaluated outside of MS ecosystem defaults to choose the ORM that's most appropriate for them. From everything I've seen there's a high correlation between active .NET OSS (i.e. anti-Dark Matter) devs and devs who adopt Micro ORMs, likewise I expect this to have a high correlation with early adopters of CoreCLR - whose major value proposition is to develop on OSX/Linux. These are some of the reasons why it'd be beneficial to include the surrounding .NET ecosystem in your decision making when making fundamental breaking design choices like this.

Alternative ORM downloads

A cursory glance at NuGet downloads provides an indication of what the non-EF market-share looks like:

NHibernate - 1M+
Dapper - 1M+
OrmLite - 500k+
Simple.Data - 300k+
PetaPoco - ~100k
NPoco - 30k+

The real numbers are a lot more than this as many Micro ORM's like Dapper, Massive, PetaPoco, NPoco, etc were designed to fit in a single drop-in .cs so NuGet isn't reporting its true usage. There's also closed source ORM's like LLBLGen Pro which have a large user base but its usage isn't reported by NuGet, likewise I'm sure I've missed a number of other ORM's I forgot / don't know about.

Impact to alternative ORM's

Thanks to GitHub we can do a quick search to see how many different source files contain the core
IDbConnection, IDbCommand and IDataReader ADO .NET interfaces impacted by this change:

IDbConnectionIDbCommandIDataReader
NHibernate59181132
Dapper172117
OrmLite1795426
Simple.Data29276
NPoco4103

Note: these results only show source files, the actual number of broken references are much higher.

Impact to Customer Source code

The actual impact of this change also extends to all projects dependencies using these ORM's.
Unfortunately the effect isn't limited to internal implementations as it also breaks the customer
source code as many Micro ORM's are just extension methods over ADO.NET Interfaces so the client
code looks like:

IDbConnection db = ...

//Dapper
db.Query<Dog>("select Age = @Age, Id = @Id", new { Age = (int?)null, Id = guid });

//OrmLite
db.Select<Author>(q => q.Name.StartsWith("A"));

One extensibility feature from using extension methods is that these ORM's are "open-ended" and Customers can extend ORM's with their own first-class API's by adding extension methods in their own projects - these are broken as well.

Obviously any source code that passes IDbConnection around is now also prohibited from working on CoreCLR.

E.g. the core ADO.NET Interfaces are heavily used throughout high-level frameworks like ServiceStack, adopted because it's the minimal dependency to enable multi-RDBMS data access. It was also assumed out of all the classes that were unlikely to change, it would be the core ADO.NET Interfaces.

Summary

I'm personally astonished there was ever going to be a future in .NET without these interfaces.
Interfaces are by design purpose-specific to allow for multiple implementations and ADO.NET is one
of the most important "open provider models" in .NET. I've no idea what priorities caused these interfaces to be removed, but both the massive existing .NET code bases that rely on these interfaces along with the alternative-EF .NET ecosystem should be given a much higher priority. This is causing a significant disruption and is a major barrier required to support both existing .NET 4.x and CoreCLR platforms, forcing a non-trivial amount of additional complexity that must be applied to all existing code-bases affected by this.

The current perception is that ADO.NET/CoreCLR is being re-designed to provide first-class support for EF and SQL Server with the rest of the ecosystem being disregarded - non-transparent breaking decisions like this only goes to re-enforce this stereotype.

@davkean
Copy link
Member

davkean commented Nov 25, 2015

As a previous member of the .NET team (I now work on Roslyn), I was heavily involved in the original design of the new data common, along with the SQL and Entity Framework teams. I'm not involved in it at the moment, but I can add some background to help correct some of the statements that I'm seeing on twitter and above.

The current design of System.Data.Common for .NET Core started in December 2012, approximately 2 years before we open sourced.

Goals:

  • Design a modern surface area for .NET Core, that reduced duplication of concepts (IDbConnection vs DbConnection), confusions, mistakes and layering issues (split SqlClient from DataCommon, split DataSet from core abstractions) from the original design from .NET 1.0. One that would be easily picked up, both by existing consumers and new developers to .NET Framework.
  • Enable providers and consumers to build a single binary/source against .NET Core, and then run that same binary on .NET Framework. Make note, the reverse was not a goal; being able to take .NET Framework binary/source and run it without some changes on .NET Core.

Correction of a few things being spread around:

  • The interfaces as they stand are not versionable. We cannot add members to interfaces, and the proposal above provided by @mythz via extension methods requires that providers derive from the abstract base classes anyway.
  • System.Data.Common has not moved away from the provider model. The interfaces were removed because they were a legacy .NET 1.0 concept that was replaced/duplicated by the abstract base classes introduced in .NET 2.0. At the time we made this decision, every provider that we could find derived from the base classes.
  • Like the interfaces, the base classes are mockable.
  • We understood there would be some changes needed for those using the .NET 1.0 interfaces, however, it is a very simple port to move to the base classes. For example, see these few lines of change for AutoMapper: (https://github.com/AutoMapper/AutoMapper.Data/blob/master/AutoMapper.Data/DataReaderMapper.cs#L14).

@mythz
Copy link

mythz commented Nov 25, 2015

Something I'm having trouble understanding:

We cannot add members to interfaces

How is it not ok to add members to CoreCLR interfaces yet, it's fine to rip them out entirely?

the proposal above provided by @mythz via extension methods requires that providers derive from the abstract base classes anyway.

The important part is that the interfaces exist and allows source code that references them to compile.

If you don't want to version the interfaces, fine EOL them, just restore the interfaces as they were before they were ripped out and mitigate the burden now imposed on every other library using them. I mean these core Interfaces were never obsoleted with no warning or migration path provided. Yet we're getting punished for adopting a published, well-known, stable API as an integral part into our libraries?

it is a very simple port to move to the base classes.

This needs to be added on every source file that references the ADO.NET Interfaces, and forces Customers to litter their code with custom build symbols.

There doesn't seem the same care for backward compatibility here, but purposely breaking existing customers in a future release is just not an option (I'm surprised it's even considered with ADO .NET's much larger market share). We can't break existing 4.x customers and yet we're being asked to support CoreCLR - So where does this leave existing 4.x libraries that want to maintain existing backward compatibility and also support CoreCLR? Should we be duplicating docs/examples as well?

@davkean
Copy link
Member

davkean commented Nov 25, 2015

How is it not ok to add members to CoreCLR interfaces yet, it's fine to rip them out entirely?

The surface area in .NET Core needs to be binary compatible with .NET Framework to enable 1st parties and 3rd parties to build against .NET Core, and run portability without changes on .NET Framework. Adding members to interfaces violates that, as consumers of those members would fail when they ran on .NET Framework.

I'm not arguing for the removal or addition of these interfaces, I just wanted to add some background to why the design ended up where it is. I'll let the current owners including @YoungGah and @saurabh500 tackle that.

Just to summarize the thread, the reason that you believe Microsoft should port these interfaces, is to enable the ecosystem to easily port to .NET Core, while maintaining their .NET Framework implementations?

@mythz
Copy link

mythz commented Nov 25, 2015

is to enable the ecosystem to easily port to .NET Core, while maintaining their .NET Framework implementations?

Yes.

@FransBouma
Copy link
Contributor

Just to summarize the thread, the reason that you believe Microsoft should port these interfaces, is to enable the ecosystem to easily port to .NET Core, while maintaining their .NET Framework implementations?

Yes. External APIs are now broken if I port my codebase (LLBLGen Pro) to corefx: I then have to expose 2 apis or break the existing codebase for all my users.

It might be fine for you people to break our stuff as you don't feel the pain, we do. It's not fine by me: I have to either live with a butchered code base and maintain 2 APIs which do the same thing, OR break my users' code because you thought that was OK.

I also don't get why interfaces don't version, it's just an interface, like a class has an interface too. CoreFX can perfectly fine add the async methods to the interfaces.

The surface area in .NET Core needs to be binary compatible with .NET Framework to enable 1st parties and 3rd parties to build against .NET Core, and run portability without changes on .NET Framework. Adding members to interfaces violates that, as consumers of those members would fail when they ran on .NET Framework.

Easy solution: add the interfaces as they are now. And once you all come to your senses that this rule above is actually rather stupid, you can add the methods you needed to add to the interfaces long ago to the interfaces and move on.

I work with MS software long enough that rules like the one above are great on paper but in practice are broken the second an important MS team needs it to be broken. If you are so 'open' and 'different' as you say are in the CoreFX marketing/pr hoopla, show it. All I see with respect to System.Data and CoreFX is 'what MS needs is done, what everybody else needs is on the backburner or ignored'.

@FransBouma
Copy link
Contributor

Another thing I forgot to mention: Fowler mentioned yesterday on Twitter that you need everybody to port their stuff. I have to pay for porting my 500K LoC codebase to CoreFX myself, it takes time, effort and will take away time for other features. Extra friction that's totally artificial (it's a new platform! How can there be restrictive rules?) really doesn't help at all: it adds extra maintenance costs, takes extra time to port the code and test and gives extra burden for our users.

All that is out of your scope, and not your concern it seems. But you forget one thing: what if we don't port our code and with me more people? I'm willing to invest time and thus my own money to port my large codebase to your new shiny framework, but sorry to say it, whenever I run into a problem I'm met with restrictions, odd rules and endless debates ending in silence. I.o.w.: I feel very much left alone while at the same time you seem so desperately want us to like your new shiny framework.

Like I said a long time ago : Sell me this framework, this new CoreFX. Well, keeping friction and introducing a lot of moved and taken away cheese is not creating a large incentive to invest a large amount of time (and money) into this.

Just my 2cents.

@davkean
Copy link
Member

davkean commented Nov 25, 2015

@FransBouma Please let's try and keep this conversation professional, productive and focused on facts.

I'm not arguing for or against adding the interfaces. However, it is not compatible to add methods to interfaces. Let's walk through this:

  1. Add IDbConnection.OpenAsync to .NET Core
  2. Anyone who calls this method, will now fail to run on .NET Framework (breaking a core principle/goal that I called out above). This also breaks the XAML designer and a few other VS features which relies on this very fact.
  3. To bring .NET Framework up-to-date, we ship a new version of .NET Framework "4.7" with IDbConnection.OpenAsync
  4. Every single type that implemented IDbConnection prior to adding this method now fails to load on .NET Framework "4.7"

This is why we cannot add methods to interfaces.

@FransBouma
Copy link
Contributor

If I keep my frustration with how things go with respect to communicating issues with MS to myself, you all won't know about it and think everything is roses and rainbows. If that looks unprofessional, so be it, I'm beyond caring whether MS thinks I'm a professional or not.

That said: I'm not married to the interfaces, so if they're gone, the fact that from then on there are classes and no interfaces to work with in theory won't make me a sad panda: what should be done can be done in theory through the base classes as well, today, as today all major ADO.NET providers play nice and derive from the base classes (this hasn't been the case in the past IIRC with ODP.NET implementing an interface but not deriving from the base class). This is also the reason why I initially up above earlier in this thread didn't really think removing them was a big deal. Since then I had some time to think about it and I think it is a big deal.

We don't live in a vacuum on Mars, and middleware/frameworks at the bottom of the stack have a problem now: users of current .NET full versions of these frameworks want to keep using them on CoreFX as they know these frameworks. Porting them over to CoreFX is however a big PITA, because of a myriad of reasons, one of them being often used interfaces exposed in public APIs not being present on CoreFX (and the reason for this thread).

For that reason alone I'd like to see the interfaces back. For me personally not for technical reasons (e.g. async needs base classes, it's a mess already). I know they lack certain methods, but that's your problem, not mine. Removing them makes that my problem and (paraphrasing now) the MS response to that is: throw up your hands with "can't be done!". But I don't have that luxury. You created this mess, you solve it. You want me to port my code, to invest a lot of time and money (which I have to pay for myself) to support your new framework, why are you then making your problem my problem?

Looking at your 4 step scenario: adding methods to interfaces isn't a problem IF you see CoreFX as a separate framework. And isn't that the case anyway? It's the same as with Compact Framework all those years ago (which I did port my framework to, and I learned a couple of hard lessons then that tell me that porting to CoreFX won't be simple, fast and easy and keeping two code bases won't either): we start with 1 API, then someone forgot something or some team within MS needs something, and viola a breaking change only a handful low-level stack devs will run into and so on and the two roads will split.

(example: Compact Framework forgot 'SerializableAttribute'. They added that with a dummy attribute doing nothing in a later version, but that broke code which anticipated on that not being present and which defined their own)

Splitting roads is understandable though: trying to keep things compatible is too restrictive. I predict here now that this rule will be broken in the future.

Seeing things as 'compatible' is important not only at the API signature level, but also on the API behavior level. Trusting that those two will be completely the same (CoreFX and .NET Full) in API behavior is too risky: a framework developer will have to test the same functionality on CoreFX and on .NET full, there's no way testing on CoreFX alone will be enough to assume the code works 100% the same on .NET full in the future: because how can you guarantee that? A call stack 20 calls deep on CoreFX has touched so much other code than on .NET full, a small detail here and there and things change.

The point in all of this is: it's a separate framework: code compiled against CoreFX can be expected to be different from code compiled against .NET full.

There are a couple of situations:

  1. a framework has a code base of which 100% compiles on CoreFX. This gives a dll which is runnable on .NET full
  2. a framework has a code base of which 70% compiles on CoreFX and 100% on .NET full. This gives 2 dlls: one for CoreFX, and one for .NET full. It's silly to run the CoreFX version on .NET full, as one misses 30% of the functionality.

In case of 1) I understand your point. In case of 2) (which is the case for all current .NET full targeting frameworks, among them all 3rd party ORMs) your point is really meaningless, as they'll have to work with 2 dlls anyway: effectively 2 codebases which have to be maintained separately, tested separately and migrated to their own new versions separately. Especially if CoreFX gets new features which aren't part of .NET full (which will be the case) yet. (btw: if you add DbDataReader.GetSchemaTable() to CoreFX which returns a different datastructure than a DataTable, because MS refuses to port that, code using DbDataReader.GetSchemaTable on CoreFX will break on .NET full as well. If you name it differently it will break as well as the method isn't there. I.o.w.: code will break if things which aren't in both frameworks are used. That doesn't mean things thus shouldn't be present in CoreFX).

To have no interfaces on CoreFX makes the situation of the framework in situation 2) a persistent one: they can't move to become a framework which fits in 1) because e.g. their API exposes the interfaces.

Microsoft rewriting their own stuff so their frameworks become frameworks in situation 1) is cool, however we don't have a million$ budget, 15+ people on the ORM runtime and a big PR machine on our side who will smooth over the wrinkles of breaking every app out there. So we're either stuck in 2) or require a little help from MS to move to 1).

That's what's at stake here. You said on twitter "tell us what you need". We did. Repeatedly. Especially regarding System.Data there is no communication. Nothing. No future plans, no discussion what to do, just dead ends and sometimes if a MS person steps in it's one who has no real stake in the matter. I appreciate your time on this, the more background we get the better, but at the same time, it's like talking to a co-worker about this: it won't get solved because the person(s) in charge are absent and not participating in the discussion.

If that makes me sound frustrated and god forbid 'unprofessional', then so be it.

Thanks for listening. Btw I have no illusions wrt System.Data: it will be a trainwreck of an API to port code to and as there's no communication from the people in charge with developers who write key frameworks on top of their API, there's little to no hope things will change. Not your fault, @davkean, it's nothing personal.

@NickCraver
Copy link
Member

I have to echo the frustrations above about the lack of communication. We need bulk inserts and schema information as well. There has been no advancement or communication in over a month (see #15269 and #14302) of these missing core (in both ways) functionality. Yet, Microsoft is labelling the current code as "a candidate for release", which itself is a message of "it's good enough." It's not. Core things are missing that need to be added and if you follow these threads, need to be in the first version for similar versioning reasons.

Look at the last update on #14302 ("Why is DataTable/View/Set Absent?"), it's from 22 days ago asking:

So System.Data.Common is feature complete now for V1 of CoreCLR?

Yes, frustration can come off as unprofessional. The tone and context of text sucks and always has, but that's what we're restricted to here. I think everyone is trying to be productive here, but we're getting quite a bit of stonewalling from the CoreFX side on actual progress in the System.Data area and that is, to be blunt, infuriating both as a library author and a user of these bits.

We need these core functional pieces, interfaces or not - I'm not hard set on interfaces and we've ported Dapper without them. But lack of DataTable, result schema info, bulk insert, and such are unacceptable in a "release candidate". Microsoft is the one increasing the frustration with labelling the current code as RC when it's almost universally agreed that it's not ready for release. Yes, it's just a label, but it's both an incorrect label and one that drastically increases the level of urgency because it's based on an arbitrary schedule (that should have changed to reflect reality). I don't think anyone in this thread is responsible for that schedule, but it's worth stating as a major factor in the frustration level.

Let's get back to the root problem. We need these pieces, and many of our millions of users do to. So let's fix it.

@mikeobrien
Copy link

Lets not forget NHibernate with 1M+ downloads:

IDbConnection IDbCommand IDataReader
59 181 132

@mgravell
Copy link
Member

The current perception is that ADO.NET/CoreCLR is being re-designed to provide first-class support for EF and SQL Server with the rest of the ecosystem being disregarded - non-transparent breaking decisions like this only goes to re-enforce this stereotype.

That perception is reinforced by things like this: https://github.com/dotnet/corefx/issues/4646

As far as I can tell, there is zero way of implementing that API in any useful way outside of the SqlClient assembly.

@nvivo
Copy link
Author

nvivo commented Nov 25, 2015

I'm currently ok with testing without interfaces. But honestly I don't get the reasoning with interface versioning and compatibility.

Isn't the idea of .NET Core is that it's a new framework without the burden of compatibility, and that it's bundled with your application, so you don't have to deal with issues like that? The provider is already incompatible with the ones in .NET due to lack of things like the schemas and datatables, so what would break compatibility with? If the interface changes, just compile against the new version and bundle it with your app.

It just sounds like most of the excuses for the design are just worries from the old framework that are not applicable to the new one. Anyway, let's see how it turns out to be in practice.

@mgravell
Copy link
Member

For those of use who intend to support multiple frameworks, and have historically targeted the interfaces... I just want to share a pile of ugly that Dapper uses; I'm not saying this is good, but it is enough to make it compile. Of course, it is duplicated in a huge pile of files... I am mainly sharing this to emphasize yet another of the impacts:

https://github.com/StackExchange/dapper-dot-net/blob/master/Dapper/SqlMapper.cs#L6-L16

@ploeh
Copy link
Contributor

ploeh commented Nov 25, 2015

We cannot add members to interfaces

Correct, and that's a good feature of interfaces. The preference for abstract base classes is the safest way to help API entropy along, instead of fighting it.

While you don't have to follow the principles of OOD, I'd suggest that you do, when creating OO APIs. In short, the Interface Segregation Principle (ISP) states that no client should be forced to depend on methods it does not use.

If you add new methods to an existing abstraction, you automatically violate the ISP.

You can decide that you 'don't have to to adhere to SOLID' because you're Microsoft, and you're working with the BCL, so therefore 'normal rules don't apply' (not actual quotes; just paraphrasing the normal counter-arguments).

Having maintained a couple of open source projects for 6-7 years, in my experience it's better to keep interfaces small. If you need to add new capabilities to an abstraction, introduce a new interface.

@jchannon
Copy link

👍

@mgravell
Copy link
Member

Awesome; is there a milestone for this to hit nuget? rc3?

On 25 February 2016 at 02:54, Saurabh Singh [email protected]
wrote:

We have added the following interfaces to .Net CoreFX in System.Data.Common

IDataParameter
IDataParameterCollection
IDataReader
IDataRecord
IDbCommand
IDbConnection
IDbDataParameter
IDbTransaction

This was done in the PR dotnet/corefx#6359 dotnet/corefx#6359


Reply to this email directly or view it on GitHub
https://github.com/dotnet/corefx/issues/3480#issuecomment-188577701.

Regards,

Marc

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

As commented on the PR, we need to understand why there are no async methods on these interfaces. As it was proposed, this is basically a trimmed down version of ADO.NET 1.1 interfaces, and I don't think the idea should be only compatibility with old code.

The interfaces should focus on the current state of ado.net, as async methods should be the default way to access any database today. Without real support for async methods, these interfaces are useless for modern develop
development.

And even including the async methods from .NET 4.5, some additional methods, like DbTrabsaction.CommitAsync should be added as well.

The postgres provider added some additional methods like CommitAsync to their api which are quite useful and necessary.

@jgauffin
Copy link

The current interfaces are fine as they are. The implication of changing them is just too large.

The async model is quite different from the synchronous one and as you might know, if you go for the async model you should do it all the way. Hence there are really no reason to have the same interfaces for both APIs. Create new ones for the async API.

If the .NET team wan to provide a more modern API, why not just create a new API which is not called ADO.NET? No legacy to be hindered by and no complaints from the community. That also fits well with how dnx is going to be distributed? i.e, independent packages.

@NickCraver
Copy link
Member

👍 on the interfaces, good compromise.

I don't think the idea should be only compatibility with old code.

That's the entire idea here. Otherwise, base classes would be fine. That's a whole lot of porting pain we want to avoid.

Without real support for async methods, these interfaces are useless for modern development.

I disagree with this, but I don't disagree with an async version of the interfaces necessarily (which no one implements today). This would be a new feature. We can't retroactively be adding members to existing interfaces, that just breaks far too many things. Having an IDbReaderAsync or something isn't crazy IMO, but it is a different discussion.

I strongly believe async methods should not be on the base classes, if the default implementation is a sync wrapper - that's actively bad and wasteful. If there's another proposal there then so be it, but again: that should be a different issue anyway.

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

Ok, maybe I expressed myself in the wrong way here or was too strong with my words.

I'm in favor of an additional interface for async if needed. What I'm not OK with is having something that defines an official contract for ADO.NET (that's what interfaces are) but doesn't have async methods anywhere in it.

But then, having alternative interfaces for async methods would probably cause other problems...

I strongly believe async methods should not be on the base classes, if the default implementation is a sync wrapper - that's actively bad and wasteful.

I agree, this is the main reason why most providers don't bother implement a real async api. But changing that would break much more code and probably cause much more noise than removing interfaces, as the base classes have been the actual API for providers since 2.0.

Upgrading a library to not use any of the 1.1 interfaces would cause almost zero impact compared to removing all async code written in the last years, that would be disastrous. The compromise is having both. Any code written today should be using async apis, so leaving that out doesn't make any sense.

@NickCraver
Copy link
Member

Any code written today should be using async APIs.

I don't want to wound too harsh, but that ideal world is a very far cry from reality. Async is very pervasive and infectious. You simply can't rely only on async APIs in libraries and expect entire applications to be async consumers (changing a ton of their code to be async as well) on a whim. Sync -> Async everywhere is also very bad for lots of deadlock and efficiency reasons. There will be synchronous code written for many years to come.

There is a strong need for both APIs. The point is: let's not remove the current ones or delay their presence for a hypothetical and not-yet-designed new set. We can worry about the second/new set independently.

Upgrading a library to not use any of the 1.1 interfaces would cause almost zero impact compared to removing all async code written in the last years

To what are you referring? There have been no async APIs for such code to exist. If you're relying on such APIs they're not on a base class or an interface, but on a provider directly. That will not be impacted by this.

Any code written today should be using async apis, so leaving that out doesn't make any sense.

Leaving out a lot of things doesn't make sense...except for the fact we're all constrained by resources (especially time). I don't believe anyone has left anything out permanently. Nothing is off the table. It simply hasn't been gotten to yet. I'd open another issue specifically for starting a spec on async interfaces for a future generation.

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

To what are you referring? There have been no async APIs for such code to exist. If you're relying on such APIs they're not on a base class or an interface, but on a provider directly. That will not be impacted by this.

.NET 4.5 introduced async methods on the provider base classes. This was in 2012, almost 4 years ago, so it has been part of the ADO.NET provider API for a while. Entity Framework 6 (released on 2013) depends on this async APIs for all providers.

Async methods are already part of the ADO.NET for enough time that a lot of people would scream if that was not included in .NET Core. I have legacy code that uses async methods in ADO.NET.

I'm advocating that since they're already part of ADO.NET, this should be present on the new interface API as well.

@mgravell
Copy link
Member

If people want (and they should) to use the async APIs, they can already do
that before this change by using the base types. Ultimately, the request
to support the interfaces was made for backwards compatibility reasons;
adding methods to an interface completely blows this out of the water.
That said, it is actually just about possible as extension methods and
type-checking against the abstract base-types, but... pretty ugly and not
worth the pain IMO.

So; short version: I can't personally get behind adding async to the
interfaces, as that destroys the one thing we wanted them for in the first
place. If you want async: you need to code against the base-classes, or use
tools that gloss over these details for you.

@mythz
Copy link

mythz commented Feb 26, 2016

I'm advocating that since they're already part of ADO.NET, this should be present on the new interface API as well.

You're completely misunderstanding the purpose of these ADO.NET Interfaces which is to maintain compatibility with existing code. These aren't new interfaces, these are existing interfaces. If you want to access the newer API's, reference the concrete base types.

@NickCraver
Copy link
Member

@nvivo Apologies, I'm just not following you - I was talking about interface APIs - those have never existed. The base types already have all of the same *Async methods - is there something specific that's missing? I think you're arguing that they should be bundled into interfaces...yeah sure, but that's another issue which I encourage you to open.

I'd much rather they have been an interface from the start since the base implementations needed to make the current approach work (async over sync) are terrible tradeoffs to make the whole approach work. But, we also can't have it both ways: either they move to interfaces or they're present (as is the current case) to minimize breaks.

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

Yeah, I think we're going in circles here. I stated this before, I don't think interfaces should be added just to help porting code. From a compatibility standpoint, the base classes have been the official API for ADO.NET since 2005, and that's what providers implement. Anything using an IDbCommand or IDbConnection could easily be ported (and should have been ported) to use base classes with a search/replace and have no downsides.

I know you're not a fan of ifdefs, but supporting this for a new platform will only be part of the migration anyway.

I agree this should have been interfaces all along, but since they weren't, I'd like to see this problem not repeat itself. If interfaces are being added, they should at least represent the current API, and not what they were a decade ago. Async methods are an integral part of the current API and that's the direction Microsoft is moving for quite some time. It would still be source compatible, just more complete.

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

@mgravell

If people want (and they should) to use the async APIs, they can already do that before this change by using the base types.

This is not about being able to do anything. It's about architecture. Interfaces are contracts, .NET Core is a new framework that is adding this contract to a redesigned version of the API.

.NET core should not add an official contract to a new API just to help migrate really old code, while most of the other stuff will be missing anyway. If that is a concern, people are just not looking enough for reasons they will need to change their code anyway.

If that's all the team is doing, than that's OK.. it's just a bad choice IMO.

@mythz
Copy link

mythz commented Feb 26, 2016

Anything using an IDbCommand or IDbConnection could easily be ported (and should have been ported) to use base classes with a search/replace and have no downsides.

False. The issues have been discussed numerous times in this thread from multiple library authors with first hand experience affected by this.

I know you're not a fan of ifdefs

Any solutions requiring end customers to use ifdefs is a broken dev experience and a non-starter, i.e. there will never be a successful product requiring customers to litter their code with #defs when alternatives exist.

If interfaces are being added, they should at least represent the current API

These are not new interfaces, they're restored interfaces. The current and future APIs are the base classes, not these interfaces. There should be no issue here, you can forget these interfaces exist and continue using the base types just as before these interfaces were restored.

There's no new value being added to this thread anymore. The existing ADO.NET interfaces have been restored so this thread can be put to rest. Only thing needed from this thread are updates to DataTable and GetSchemaTable() as they pertain to the existing interfaces. If you want to propose architectural changes or advocate for new interfaces, open a new issue - which will stop everyone in this list from being spammed.

@nvivo
Copy link
Author

nvivo commented Feb 26, 2016

@mythz let's agree to disagree.

@aidapsibr
Copy link

Just adding my 2 cents as another ORM developer, abstract classes are always a code smell when not backed by an interface. Would love to see new interfaces provided to match the abstract classes and method signatures overloaded with a minimum requirement interface API.

Thumbs up to the community for speaking up.

@davkean
Copy link
Member

davkean commented Mar 1, 2016

abstract classes are always a code smell when not backed by an interface

@psibernetic Can you help me understand that statement? What about this is a code smell?

@eocampo
Copy link

eocampo commented Mar 1, 2016

@psibernetic

Interfaces and abstract classes give us contract, both give us an abstraction and a good definition for the API. Interfaces are most useful when implementing classes that could implement more than one interface or are subclasses of another base class (assuming that is a very big advantage of that base class). In this case in particular, concrete classes for Connection, Command, and so on for specific providers have a strong IS A relationship with the abstract API definitions. I really can't imagine a scenario when some developer needs to add a concrete implementation for IDbConnection or IConnection to a subclass. The almost only scenario will be new classes that only derive for the abstract class and to "duplicate" the same definition on an Interface is more work (unnecessary) for the API designer.

Do you see a specific and concrete advantage or scenario to have two equal abstractions? When the interface does provide practical and real benefit over the abstract class in this specific API design?

The only advantage that I can think of for the Interfaces is the backward compatibility that we need with the old ones to break less actual running code depended on those interfaces. If we doesn't had the old interfaces I'm pretty sure that the abstract classes will be enough.

@aidapsibr
Copy link

@eocampo You are right that the abstract classes probably provide "good enough" abstraction and contracts. I always try to provide very narrow interfaces that represent actions that can be taken such as IAsyncCommand and the likes. That allows my frameworks to be plugged into in ways that may have not been considered at design time of the framework with less chance of terrible NotSupportedExceptions or NotImplementedExceptions.

@davkean The code smell is that in most cases, although not all, you are requiring an implementer to implement or inherit an entire base set of functionality that may not be relevant. I remember seeing IDataReader implementations that read from a cache or in memory instead. Not sure if the DbDataReader abstract class would allow that, but the name implies no.

@aidapsibr
Copy link

The best-practices model followed predominately in dot net has been expose interfaces and inherit from base classes has it not?

@eocampo
Copy link

eocampo commented Mar 1, 2016

The best-practices model followed predominately in dot net has been expose interfaces and inherit from base classes has it not?

@psibernetic Well not always. For example this recommendation on the MSDN site has more than a decade there. And that guideline is very common from .Net Framework 2.0 at least.

Also this is a good reference of the guidelines for library design in .Net from the early days:

http://www.amazon.com/Framework-Design-Guidelines-Conventions-Libraries/dp/0321545613

@eocampo
Copy link

eocampo commented Mar 1, 2016

Anyway I think the hard discussion here is about two subjects now:

a) The Interfaces are just for backward compatibility or can we "start from scratch" (breaking code) to allow a cleaner interface and API design.
b) How much can we go for a modern and clean design at the cost of not be compatible with the full .Net framework. (Compatibility specifically between .Net Core and Full core on data access [not the most low level and mandatory compatibility])

From my perspective, if we have the abstract base classes as the primary and preferred contract, then the interfaces must match the old ones just for compatibility. I understand that @nvivo already has stated that after .Net 2.0 the official contract was the abstract base classes so we could think that interfaces will not resolve the compatibility issue but @mythz and @mikeobrien had also provided hard data here about the dependency of providers on the 1.1 interfaces.

To stop spamming and discussing the topics here we will need to read again this long conversation and I don't know if we can agree on the LIST of specific topics that we are addressing or if it is a good idea to create two or three new issues for each specific topic. I'm more of the first suggestion because there is a lot of good points here. I don't have a good idea of how we can re-summarize all these and clear some noise (even mine).

@Halofreak1990
Copy link

Speaking of interfaces, are there plans to finally make parts of System.Data generic? It has always bothered me that System.Data never really got its API updated beyond .NET 1.1, leaving people with having to use hacks like the .AsEnumerable() extension method to get an IEnumerable out of a DataTable. Why haven't collections like DataRowCollection been upgraded to implement the generic interfaces when everything else in the framework did when 2.0 came out?

@AndrewZakharkin
Copy link

Will there be a stub System.Data with type redirections in it? I need to use ODP.NET but now I cannot.

@AndrewZakharkin
Copy link

Created dotnet/corefx#7874

@daxfohl
Copy link

daxfohl commented Dec 17, 2016

@mgravell @ploeh "Rickasaurus" implied typeclasses were on the horizon (for F# at least, not sure about C# or .NET in general https://news.ycombinator.com/threads?id=Rickasaurus). If it's the case that they're coming for all .NET, would it resolve the issue?

I'm not a Haskell expert, but my understanding is they'd allow you to bolt on a simple IDbConnection, IDbConnectionAsync, and any future shared interfaces after-the-fact without breaking source or binary compatibility, and without forcing 3rd-party providers to implement everything. This, while retaining easy mockability.

Is this a correct understanding? If so, is there any chance this feature is coming to .NET for real?

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 1.1.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests