Add image access API to Poppler Qt bindings
Open, NormalPublic

Description

As per discussion with Albert during Akademy.

The direct use of (private) Poppler API makes us very unpopular with distributions and David. However the Qt API we are supposed to use doesn't offer access to images yet.

Therefore add QList<ImageBox*> Page::imageList() const upstream, using more or less the same approach we use in kitinerary once T8813 is done.

vkrause created this task.Aug 19 2018, 8:00 AM
vkrause triaged this task as Normal priority.
vkrause added a subscriber: aacid.Aug 24 2018, 3:05 PM

With T8813 implemented and working, next is finalizing the API for this.

@aacid, does this roughly match what you had in mind?

class ImageBox {
public:
    // source image without transform applied, loaded on demand
    QImage image() const;
    // source size, before transform is applied
    int width() const;
    int height() const;
    QSize size() const;
    // CTM
    QTransform transform() const;
    // TODO: num/gen to identified repeated images that are painted multiple times, is there a standard way to expose this already?
};

class Page {
public:
    ...
    // caller takes ownership
    QList<ImageBox*> images() const;
    QList<ImageBox*> images(QRectF) const;
};
  • Should we keep the heap allocated class pattern rather than value types, for consistency?
  • Should this go in poppler-qt5.h or separate poppler-image.h header? Not sure what the preferred style is there.

Sounds about right i'd say

About

// TODO: num/gen to identified repeated images that are painted multiple times, is there a standard way to expose this already?

Those shouldn't be returned in those methods since ImageBox doesn't have a position, or you should add the position info, i'm sure people would like that (actually i think we have some todo in okular from old times before using poppler)

Should we keep the heap allocated class pattern rather than value types, for consistency?

yeah, let's do that

Should this go in poppler-qt5.h or separate poppler-image.h header? Not sure what the preferred style is there.

I've no clue :D We don't really have a style, I'd say let's add it to poppler-qt5

Thanks for the feedback!

In T9445#157053, @aacid wrote:

About

// TODO: num/gen to identified repeated images that are painted multiple times, is there a standard way to expose this already?

Those shouldn't be returned in those methods since ImageBox doesn't have a position, or you should add the position info, i'm sure people would like that (actually i think we have some todo in okular from old times before using poppler)

The position is provided as part of the transform, but having a more convenient API for it would probably make sense. For symmetry this would probably need to include the transformed size too though?

// option 1
QRectF targetRect() const;
// option 2
QPointF position() const;
QSizeF transformedSize() const;
// option 3
double x() const;
double y() const;
double transformedWidth() const;
double transformedHeight() const;

Which approach would you prefer? target or transformed as the prefix? QPainter seems to use target in parameter names for this.

And then something like:

int objectNumber() const;
int generationNumber() const;

Slightly uneasy about that one, as this seems to be exposed nowhere else in the Qt5 API yet. Hm, or maybe rather something like bool ImageBox::hasSameImageContent(ImageBox *other) const? That doesn't allow hashing (and therefore use of a QSet) though.

aacid added a comment.Aug 28 2018, 9:22 PM

option 1 sounds interestingly compact, i guess with a nice documention people will understand it

You don't really need the generation number since we won't we returning "old" versions of the same object, so return only the object number, and then you can go away from "pdf specifics" if you don't feel confortable and just call it
int id() const;

And then add as extra in the documentation that it's the pdf object number