Maybe a stupid question... List<? extends RealLocalizable> vs List<RealLocalizable>

Sorry for the generic java question but it relates to work on imagej-ops…

When should you use SomeClass<? extends SomeInterface> as inputs to functions or return values and when should you use SomeClass<SomeInterface>? I seem to see both patterns. Are they equivalent?

Thanks,

Jay

Good morning Jay,

the two are not equivalent. General example: A Collection<SomeClass> will only take objects of type SomeClass while a Collection<? extends SomeClass> will also take classes that are subclasses of SomeClass (see https://docs.oracle.com/javase/tutorial/extra/generics/wildcards.html section on Bounded Wildcards).

This directly translates to Ops: a UnaryOp<? extends RealType< ? >, DoubleType> takes a value of a type that extends RealType<?> as input and “returns” an instance of DoubleType.

All the best,
Stefan

1 Like

It is better to think of ? as “some unknown subtype.” So e.g. a Collection<? extends Number> means “a Collection typed on some unknown subtype of Number.” It might be a Collection<Integer> but it might be a Collection<Double>. At this scope, the compiler does not know which, and so only allow operations that would work with any possibility.

The way it is phrased above implies that a Collection<SomeClass> would only accept objects typed on exactly SomeClass rather than SomeSubclass extends SomeClass, which is not the case. Whereas a Collection<? extends SomeClass> will not accept a SomeClass object as argument to e.g. add, since the collection is actually typed on some unknown subtype of SomeClass.

The rule of thumb for API is:

  • If possible, use a type variable in the method declaration when the method takes an input involving that variable; e.g. <B extends Bar> void processFoo(Foo<B> foo);
  • If you don’t want/need to use a type variable, then favor Foo<? extends Bar> over Foo<Bar> as method argument, so that callers can pass e.g. Foo<SubBar>.
  • Return Foo<Bar>, not Foo<? extends Bar>, as return type. Because getting a Foo<? extends Something> back from a method call causes lots of difficulties.
  • Lastly, do not use a type variable when the method does not have an argument involving that variable. E.g., Foo<B> computeFoo(String expression) causes problems, because there is no way for the code at runtime to discern which B you need. The compiler allows this in certain circumstances, but it is not type safe. There are currently some places in Ops that need fixing to avoid this issue (see e.g. imagej/imagej-ops#160).

In other words: be permissive in what you accept, and strict in what you produce. Of course, like any rule of thumb, there are exceptions.

If you see places this rule of thumb is violated in Ops, feel free to point it out, and we can discuss whether the design could be improved there.

The situation gets nasty with nested generics, and incredibly nasty when the generics are recursive. A ? extends RealType<?> is effectively useless, because the compiler does not understand that those two wildcards actually must be the same. Think about the more common situation of ? extends Collection<?>—obviously those ? are two different placeholders. But for ? extends RealType<?> they actually must match due to the recursive declaration of T extends RealType<T>.

In short: never use ? extends RealType<?> or RealType<?>. Always use a type variable (e.g., T), even if you have to create an extra method signature to declare it.

5 Likes

Thanks for the responses. Very helpful.

Here is a small section of test code I wrote to illustrate your comments/scenarios for others that happen upon it. I think was confused me was that I thought printNumber2 (takes a SomeClass<SomeSubInterface> should have worked with a method that takes a SomeClass<SomeInterface> just as passing a SomeSubInterface can be passed for a SomeInterface object to a method.

    public static void tryArgs() 
    {
        Vector<Number> nums = new Vector<Number>();
        nums.add(new Double(0.0)); // Compiles
        printNumber(nums); // Compiles
        printNumber2(nums); // Compiles
        
        Vector<Double> doubles = new Vector<Double>();
        doubles.add(new Double(0.0));
        printNumber(doubles); // Compiles
        printNumber2(doubles); // Doesn't compile
        printDouble(doubles.get(0)); // Compiles
        
        Vector<? extends Double> extDoubles = new Vector<>();
        extDoubles.add(new Double(0.0)); // Doesn't compile
        printNumber(extDoubles); // Compiles
        printNumber2(extDoubles); // Doesn't compile
        printDouble(extDoubles.get(0)); // Compiles
    }
    
    public static void printNumber(Vector<? extends Number> nums)
    {
        System.out.println(nums);
    }
    
    public static void printNumber2(Vector<Number> nums)
    {
        System.out.println(nums);
    }
    
    public static void printDouble(Double d)
    {
        System.out.println(d);
    }
1 Like

Is this an exception to the rule here?.. imglib2 Polygon.getVertices() method that returns List<? extends RealLocalizable>. At first glance seems, it seems to break the “rule of thumb” but it appears necessary as this method returns access to the internal variable which can’t be cast to List<RealLocalizable>. My thought is that often I won’t be constructing polygons myself but be using Polygons that result from ops. Therefore, I don’t have the vertices / List<RealLocalizable> that was used to construct it (e.g., converting a LabelRegion to a Polygon). Thus, a return of Polygon from an Op becomes a tiny bit more cumbersome when I actually want the List<RealLocalizable>. I’m fine though if I just have to add the points to a new List<RealLocalizable> if I want to work with the list contents. I can also see wanting to keep the design this way for other reasons. Thought I would bring it up though as I ran across it.

Thanks,

Jay

Yeah, personally I think that class should copy the list of vertices from the input List<? extends RealLocalizable> into an internal List<RealLocalizable> via the addAll method (which does accept List<? extends E>, so it’s easy).

Then, the getVertices() accessor could return the internal list.

Alternately, the getVertices() accessor could do the copy every time—either way, I guess.

@tpietzsch @dietzc Do you guys agree that this would be a good change for the Polygon class? I think it would make the API nicer, while avoiding unintended mutation of the polygon.

Imagine the compiler allowed this. Here is a modified version of your example which demonstrates why that would wreak havoc:

public static void tryArgs() 
    {
        Vector<Double> doubles = new Vector<Double>();
        doubles.add(new Double(0.0));
        printNumber2(doubles); // Suppose this compiled
        Double fiftySeven = doubles.get(1); // Would throw ClassCastException!
    }
    
    public static void printNumber2(Vector<Number> nums)
    {
        System.out.println(nums);
        nums.add(57); // Integer, not a Double!
    }

For more details, see this answer on StackOverflow.

Ahah. I see. Thanks.

@ctrueden, @dietzc Depends on whether we want the user to be able to modify the list after constructing the Polygon. @dietzc, what do you think? In both cases, the question is whether we might even want to do a deep copy. E.g. if constructed with a List<RealPoint> after addAll() to the internal list, it is still possible to move vertices around. If getVertices() returns the internal list, it is even possible to add new vertices.

Not that my vote should matter much compared to you all but I suppose my preference would be to addAll upon construction and allow returning the internal ref.

It seems easy enough, want me to submit a PR to this effect?

Actually, I have no strong opinion whether we should allow post-modifications of the Polygon, but it feels somehow more correct (and less prone to errors) if we don’t allow modification of the Polygon after construction. One could always create a new Polygon if desired.

@tpietzsch A deep copy be made when creating the Polygon in the constructor, right? I don’t have an use-case in mind, but this might hurt runtime? But you are right, to be really, really safe we should do that.

@dietzc We can also do no deep copy for now and add it later if it ever causes problems. Then shallow copy in constructor and return unmodifiable list in getVertices()

That sounds easy and sufficient enough for now. At least this would allow returning an object where you can use .get(int) and get back a useful object. Returning an immutable list is fine with me. It doesn’t appear that it will likely ever be a class for modifying polygons, but interpreting them via contains etc. It would seem something more like a PointList would be a better and more general fit for modifying lists of points that could always be interpreted as a polygon using the Polygon class.

@dietzc Thanks for the comments on the pull request. I’m new the autoformatting game, at least the way you folks do it. How do I apply the style defined by the files in your link on the pull request? Any how-to’s out there?

I checked and it seemed that at least my user.name was set in my config already but potentially not my email. Hopefully that is set now. Was the email a crucial item. I would have though that the username would have been sufficient. This is in my ~/.gitconfig file on my mac. Am I doing something wrong?

Hi @jaywarrick,

the formatter can be found in imglib/doc as pointed out in the github thread (see http://stackoverflow.com/questions/17338283/import-a-user-defined-checkstyle-eclipse how to use it). Usually, all the projects have their own style settings (e.g. ImageJ has one, ImgLib2 has one, KNIME has one etc etc). This is really important to make the code more readable and consistent.

What you usually do if you edit an existing class, you simply apply the format after you have made your changes. this can be done in one commit, as your changes are new anyway. If you created a new class, it’s also no problem. The only problem arises when you edit a class which was not formatted correctly (for some reason), commit your changes + the formatting changes of the class. Then it’s really hard to understand for others when reviewing your code which changes were introduced by your commit (like the logical changes) and which changes come from the formatting. In this situation you either do a format commit first then your logic or the other way round.

Does this make sense?

Concerning the git & github: Just test it?

Best,

Christian

1 Like

Indeed. It should be enough to use git show (for single commit) or git log -p (for series of commits) and verify that all the details of the patches (author, committer, the patches themselves, etc.) look correct. If they are not correct: fix them before pushing!