[Astyle] Add Objective C to list of languages with formatters
ClosedPublic

Authored by rjvbb on Sep 15 2018, 12:20 PM.

Details

Summary

This improves support for ObjC and ObjC++ in the artistic style plugin, and makes those languages show up in the style selector.

Test Plan

Works as expected.

It can be discussed whether or not there need to be specific styles for ObjC (ObjC++), or whether its better just to use the style selected for "plain" C (C++). The ideal would be (IMHO) to apply the styles selected for C and/or C++ unless the user selected a dedicated style, but that might be overly complex to implement.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 3012
Build 3030: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
rjvbb added a comment.EditedSep 15 2018, 12:53 PM
Thanks. Learned about ObjC++ this way... the world is full of strange things :)

Depending on how you view C++ ObjC++ may be the best of both worlds or a horrible way to degrade a beautiful language :)

EDIT: one thing I keep noticing and forgot to mention is that indentation control doesn't work properly in ObjC files (even when you select C/C++ style in the configuration dialog). In particular, un-indentation doesn't work properly (for closing braces and for #preprocessor expressions).

kossebau accepted this revision.Sep 17 2018, 11:01 AM

If you have a chance/time, please considerupdating the patch by adding some samples for the new language types to AStylePlugin::previewText(SourceFormatterStyle, QMimeType).
Currently that one uses C++ code for the preview, which surely is a non-pleasing experience.

Given that the method is right now lacking proper samples for C, Java and C# as well, IMHO the patch as-is can still go in. But fixing this from the start would be good :)

This revision is now accepted and ready to land.Sep 17 2018, 11:01 AM
rjvbb added a comment.Sep 18 2018, 8:46 AM

On it, but the changes go a bit further than I (and you?) might have guessed ([private] method renaming e.g.). I hope to have something by the end of day, but if you think it'd end up having to be split into a separate commit I could just as well hear that now ;)

rjvbb updated this revision to Diff 41929.Sep 19 2018, 9:22 AM

I added ObjC previews as requested, and doing so I noticed I had to make a few more changes, justifying a ticket name change. I did mention the change was going to be a bit more elaborate than one might have thought, didn't I? :)

  • astyle 3.1 has a number of ObjC-related fixes so I updated the bundled copy. BTW I think that the plugin info blurb should refer to the project it's based on so I added the info. It should probably also mention the fact that the astyle library is under the MIT license (!), somehow.
  • I had to change the order in supportedMimeTypes(); without that ObjC files were handled as if they were (C) headerfiles, suggesting the returned list is scanned upstream against the mimetype at hand and the first match is applied. After changing the order things worked as expected so I didn't investigate further. I didn't notice any regressions.
  • I made the beginnings for a full initial implementation supporting other language previews, reusing the AStylePreferences::Language enum, currently just doing ObjC vs. the rest (CPP). I considered moving the Language enum, but thought it's not really better at its place in AStylePlugin than in AStylePreferences.
rjvbb retitled this revision from kdev-astyle : add ObjC mimetypes to kdev-astyle : improved ObjC support.Sep 19 2018, 9:25 AM
rjvbb edited the summary of this revision. (Show Details)
rjvbb set the repository for this revision to R32 KDevelop.

I added ObjC previews as requested, and doing so I noticed I had to make a few more changes, justifying a ticket name change. I did mention the change was going to be a bit more elaborate than one might have thought, didn't I? :)

Thanks for the work on updating the astyle copy, that should improve things in general, bringing more bugfixes and options.
Sadly I again have to ask you to split out that very aspect into a separate review request. This update not only affects ObjC, but also (most interesting here given current userbase) C & C++, so this needs some proper checking and testing, and independatnt handling of adding ObjC options with preiview to the style format selection.

Let us try to keep changes as small and self-contained as possible. This makes processing easier for the majority of the stakeholders involved (reviewers and future commit readers). Yes, more work for the author :) but in total it is less.

Yeah, I was expecting this, though I hoped it could just be separate commit, not a separate review.

Can the upgrade at least be done in the 5.3 branch, or if not, merged/cherry-picked ASAP after committing to the main branch?

Can the upgrade at least be done in the 5.3 branch, or if not, merged/cherry-picked ASAP after committing to the main branch?

IMHO this should be master-only, and we should rather concentrate on getting 5.3 out finally, and then having another 5.4 still this year. Over the years I have become a fan of rolling releases, at least for projects with low number of contributor resources.
Also, given that astyle changed some flags for 3.0 IIRC, there are surely some side-effects when it comes to custom scripts, so we would need to think about forward path for users. So nothing that should be rushed .

But maintainer's call.

rjvbb added a comment.Sep 19 2018, 1:04 PM

It's just that I use the 5.3 branch.

FWIW, 5.3.{1,2,3,...} is a rolling release scheme too, and even the ObjC changes represent a minor change only (almost a bugfix), not anyhing IMHO that would justify waiting for a bigger version bump.

rjvbb updated this revision to Diff 41944.EditedSep 19 2018, 6:08 PM

The libastyle upgrade was split off as D15605, as requested.
(Fortunately the 2 changesets were orthogonal.)

rjvbb set the repository for this revision to R32 KDevelop.Sep 19 2018, 6:08 PM

BTW, this change is still marked as approved, is it?

BTW, this change is still marked as approved, is it?

That was for the initial patch, without the sample text. Going to check the current version of this patch now, should report later tonight (optimistic on quick scan).

pino added a subscriber: pino.Sep 19 2018, 6:41 PM
pino added inline comments.
plugins/astyle/astyle_plugin.cpp
36

static const char foo[] = "...", which is more efficient (makes the whole string const)

71

ditto

103

ditto

156

ditto

kossebau accepted this revision.Sep 19 2018, 8:01 PM
kossebau added a subscriber: kfunk.

Code looks fine to me, modulo all the comments made, and works where I tested it. Please also update the summary text and title of the patch.

For some reason the preview uses the "C" highlighting mode for me when ObjC is selected, so the preview has some highlighting issues. That seems a bug with KTextEditor I have to explore more.

Not sure if this is 5.3 or master material. No new string inside, so I would be tempted to have this still with 5.3, given we are still before final release and it's a small innocent addition at the outer spheres, so should not introduce regressions. @kfunk Your take as maintainer?

plugins/astyle/astyle_plugin.cpp
327

While adding this line, already add a space between if and (.

344

Please keep this one single string calculation:

return
    QLatin1String("// Indentation\n") +
    indentingSample(lang) +
    QLatin1String("\t// Formatting\n") +
    formattingSample(lang);
plugins/astyle/astyle_plugin.h
28

Make first include, so plugin-specific includes come first

plugins/astyle/astyle_preferences.h
62

const Language m_currentLanguage;, to point out this is not going to change.

kossebau requested changes to this revision.Sep 19 2018, 8:24 PM

For some reason the preview uses the "C" highlighting mode for me when ObjC is selected, so the preview has some highlighting issues. That seems a bug with KTextEditor I have to explore more.

Nope, this is actually some bug in KDevelop which should be fixed first, somewhere in`SourceFormatterStyle::modeForMimetype(...)` the wrong mode is chosen. Still digging.

This revision now requires changes to proceed.Sep 19 2018, 8:24 PM
rjvbb added a comment.Sep 19 2018, 8:26 PM
Code looks fine to me, modulo all the comments made, and works where I tested it. Please also update the summary text and title of the patch.

What title do you propose?

For some reason the preview uses the "C" highlighting mode for me when ObjC is selected, so the preview has some highlighting issues. That seems a bug with KTextEditor I have to explore more.

I didn't notice that, it doesn't do that for me (KF5 Frameworks 5.47.0). What I do notice is that no preview is shown for profiles I defined myself, but that also happens for me without my patch.

kossebau accepted this revision.Sep 19 2018, 9:32 PM
Code looks fine to me, modulo all the comments made, and works where I tested it. Please also update the summary text and title of the patch.

What title do you propose?

Perhaps
[Astyle] Add Objective C to list of languages with formatters

For some reason the preview uses the "C" highlighting mode for me when ObjC is selected, so the preview has some highlighting issues. That seems a bug with KTextEditor I have to explore more.

I didn't notice that, it doesn't do that for me (KF5 Frameworks 5.47.0).

Possibly because Qt on macOS uses the macOS mimettype database (or the adaption to that specific format definition which there is on macOS IIRC), so things do not fail as here with the XDG shared-mime-info database. Where someone once decided that Objective-C is a subtype of C...
https://bugs.freedesktop.org/show_bug.cgi?id=6743
https://cgit.freedesktop.org/xdg/shared-mime-info/commit/freedesktop.org.xml.in?id=e1851ad681f64b5f5dde429e1a79801ef08311b5

No real clue about Objective-C, but the claim that Objective-C is a supertype to C seems wrong, no? That would need a fix in shared-mime-info then.

What I do notice is that no preview is shown for profiles I defined myself, but that also happens for me without my patch.

Yes, known (to me) bug, I have a patch somewhere I need to finish one day.

Concern about preview repulled, not due to this patch :) (possibly we could do some workaround, but I would first wait for someone complaining).

plugins/astyle/astyle_plugin.cpp
284

What about "ObjC" -> "Objective-C"? That is the id also used for X-KDevelop-Languages, would be good to be consistent.
Also is this the string currently used for display, so not using an abbreviation might be better.

This revision is now accepted and ready to land.Sep 19 2018, 9:32 PM
Possibly because Qt on macOS uses the macOS mimettype database

Actually I have mostly been working under Linux on this.

No real clue about Objective-C, but the claim that Objective-C is a supertype to C seems wrong, no? That would need a fix in shared-mime-info then.

No, ObjC is C with OO extensions, that's not wrong AFAIK.

I'm still using s-m-i 1.8 though, and I did make an addition to the shared-mime-info db for ObjC++ (lotsa translations snipped):

<?xml version="1.0" encoding="UTF-8"?>
<!--
It comes with ABSOLUTELY NO WARRANTY, to the extent permitted by law. You may
redistribute copies of update-mime-database under the terms of the GNU General
Public License. For more information about these matters, see the file named
COPYING.
-->
<!--
Notes:
- the mime types in this file are valid with the version 0.30 of the
  shared-mime-info package.
- the "fdo #xxxxx" are the wish in the freedesktop.org bug database to include
  the mime type there.
-->
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="text/x-objc++src">
    <comment>Objective-C++ source code</comment>
    <comment xml:lang="en_GB">Objective-C++ source code</comment>
    <sub-class-of type="text/x-objcsrc"/>
    <sub-class-of type="text/x-c++src"/>
    <magic priority="30">
      <match value="#import" type="string" offset="0"/>
    </magic>
    <glob pattern="*.mm"/>
  </mime-type>
</mime-info>
kfunk requested changes to this revision.Sep 20 2018, 6:51 AM

Please update the Diff according to our requests, then I think this can still go into 5.3 branch. Friedrich, you still agree this could go into 5.3? :)

plugins/astyle/astyle_plugin.cpp
377

No break; needed after return.

380

Add a Q_UNREACHABLE(); at the end of this function, as a small debugging aid.

This revision now requires changes to proceed.Sep 20 2018, 6:51 AM
rjvbb marked 11 inline comments as done.Sep 20 2018, 9:52 AM
rjvbb added inline comments.
plugins/astyle/astyle_plugin.cpp
36

And there I was thinking the days were gone where foo[] and *foo were ever so subtly different... :)

284

Thanks, that's a nice reminder there a few more places where changes are required and for which I have D15530 up!

327

I take it you meant fixing the whitespace for the entire method(?)

377

I know, I always add them anyway in my own code (just like I always add a default: with a break). Saves headscratching later on when you refactor and miss a missing break...

rjvbb updated this revision to Diff 41974.Sep 20 2018, 9:53 AM
rjvbb marked 4 inline comments as done.

Updated as requested, waiting for final a-ok.

rjvbb retitled this revision from kdev-astyle : improved ObjC support to [Astyle] Add Objective C to list of languages with formatters.Sep 20 2018, 9:54 AM
rjvbb edited the summary of this revision. (Show Details)
rjvbb set the repository for this revision to R32 KDevelop.
kossebau accepted this revision.Sep 20 2018, 10:52 AM

For some reason the preview uses the "C" highlighting mode for me when ObjC is selected, so the preview has some highlighting issues. That seems a bug with KTextEditor I have to explore more.

Nope, this is actually some bug in KDevelop which should be fixed first, somewhere in`SourceFormatterStyle::modeForMimetype(...)` the wrong mode is chosen. Still digging.

Seems that this needs some fixing in SourceFormatterStyle::modeForMimetype(), which is too eager picking any first style where mime.inherits(item.mimeType), which is then the "C" style given it comes first and shared-mime-info has this questionable inheritage for C++ and Objective-C from C. So nothing this patch here is guily for (now that it also uses ids matching the KSyntaxHighlighting mode ids).

Please update the Diff according to our requests, then I think this can still go into 5.3 branch. Friedrich, you still agree this could go into 5.3? :)

Not tested again the latest patch, but from code reading looks good. Can be considered a bug fix, so 5.3 okay with me. Also no string freeze break. So let's have it in 5.3, making more potential users happy earlier :)

plugins/astyle/astyle_plugin.cpp
36

For what I know, the days are still on-going due to this:

static const char * foo = "foo";

is a non-const pointer to a const char array. You can do (accidentally)

foo = "bar";

and the compiler will not see this runs against the (usual) intent.

It is also one indirection more (unless the compiler is sure enough to be able to optimize this out), as foo is a symbol for a data that is a pointer, which holds the address of the char array.
Where with

static const char foo[] = "foo";

foo is a symbol for a data that is the char array.

284

And as I meanwhile found, using "Objective-C" and "Objective-C++" here is also required, as this is by chance also the id used by the syntax highlighting mode for the respective highlighting rules.

327

Actually I only meant the lines you are touching, sorry for being unclear.

... shared-mime-info has this questionable inheritage for C++ and Objective-C from C.

I don't think that's so questionable; doesn't the C++ entry also inherit from C? That would be logical at least historically speaking (and properly written C can still be compiled as C++ AFAIK).

No, I think that as with all situations where inheritance is at play, one needs to work backwards, matching the most "evolved" choices first ... and cross fingers you don't run into too many multiple inheritance cases. That'd be the case for ObjC++. Then again, this is for the formatter only, fortunately, and the underlying library seems to identify the various C family members all by itself.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 20 2018, 12:12 PM
This revision was automatically updated to reflect the committed changes.
... shared-mime-info has this questionable inheritage for C++ and Objective-C from C.

I don't think that's so questionable; doesn't the C++ entry also inherit from C? That would be logical at least historically speaking (and properly written C can still be compiled as C++ AFAIK).

C is not a subset of C++ (let me point to Mr. Stroustrup who should know it :) https://web.archive.org/web/20080617183013/http://www.research.att.com/~bs/bs_faq.html#C-is-subset )

The inheritance in sthe shared-mime-info is actually done as a hacky workaround for pattern-based magic failing to properly detect the MIME type, see
https://bugs.freedesktop.org/show_bug.cgi?id=6743

No, I think that as with all situations where inheritance is at play, one needs to work backwards, matching the most "evolved" choices first ...

Agreed, that's what my current patch tries to do.

rjvbb added a comment.EditedSep 20 2018, 12:48 PM
C is not a subset of C++ (let me point to Mr. Stroustrup who should know it :)

I wouldn't call it that either (it'd be degrading for C ;)), but C++ does inherit from C (and was a layer on top of C in its early days, IIRC with a sort of preprocessor).

In fact, Mr Stroustrup does say basically the same thing:

Except for a few examples [...], C++ is a superset of C.

Yes, he also points out that C and C++ can also be siblings, depending on what version you're talking about, but that probably goes beyond the scope of mimetypes.

The inheritance in sthe shared-mime-info is actually done as a hacky workaround

Only? Not also to avoid code duplication?

Agreed, that's what my current patch tries to do.

Up for review already?

aaronpuchert added inline comments.
plugins/astyle/astyle_plugin.cpp
36

The advice is mostly important for exported symbols, see Ulrich Drepper: How To Write Shared Libraries, section 2.4. For static (internal linkage) variables the compiler knows all references to the variable, so it can remove the indirection and see that it is const (if it is).

Nevertheless, it obviously doesn't hurt to get used to writing char[], because sometimes it does matter.

By the way, even more modern would be static constexpr char[].

377

We could activate -Wunreachable-code so the compiler tells you where break isn't needed. With -Wfallthrough the compiler can detect unintended fallthroughs, so you wouldn't need to worry about a missing break either.

C is not a subset of C++ (let me point to Mr. Stroustrup who should know it :)

I wouldn't call it that either (it'd be degrading for C ;)), but C++ does inherit from C (and was a layer on top of C in its early days, IIRC with a sort of preprocessor).

In fact, Mr Stroustrup does say basically the same thing:

Except for a few examples [...], C++ is a superset of C.

Well, what matters to me is that C++ itself is not a strict(!) superset :) But actually this is also not that relevant here, as the actual problem is this: the shared-mime-info meaning of inherits is "is-a". E.g,

C inherits PlainText inherits Octetstream -> C is-a PlainText is-a Octetsteam. So e.g. any handlers (like viewers) of a given format can also handle files of a format that inherits from it.

C++ inherits C, as currently claimed in s-m-i, woule mean: C++ is-a C. Which would mean C++ is a subset of C. Which is the reverse of what we have talked about above, no? :)

Agreed, that's what my current patch tries to do.

Up for review already?

No, still trying variants, being unhappy with code amount.

kossebau added inline comments.Sep 21 2018, 4:16 AM
plugins/astyle/astyle_plugin.cpp
36

Not yet firm with constexpr myself, so curious to learn what would one gain from using static constexpr char[] here? What exactly would be buildtime evaluated?
I think I have seen constexpr already in KDevelop code, so if that is a better version, would be interested to improve all similar places.

rjvbb added a comment.Sep 21 2018, 8:53 AM

We could activate -Wunreachable-code so the compiler tells you where break isn't needed.

Evidently I was talking about a general principle, not about introducing a dependency on the compiler telling you what (not) to do. If I wanted that I'd be coding in Modula-2 ...

C inherits PlainText inherits Octetstream -> C is-a PlainText is-a
Octetsteam. So e.g. any handlers (like viewers) of a given format can also
handle files of a format that inherits from it.

But those viewers cannot be expected to handle all specifics of a further evolved format, can they now? Despite all its differences C++ still has a lot in common with C, and formatting it as C is better than formatting as plain text, no?
I know the s-m-i entries talk about "subclass-of" (which to me in fact suggests the opposite, i.e. subset of) but Qt translates that to inheritance which I think is fine. And describes the situation exactly; C++ and ObjC aren't called D++ and ObjD for a reason after all ;)

We could activate -Wunreachable-code so the compiler tells you where break isn't needed.

Evidently I was talking about a general principle, not about introducing a dependency on the compiler telling you what (not) to do. If I wanted that I'd be coding in Modula-2 ...

Eventually we have to compile our code anyway, so aren't we all dependent on the compiler? I'm not sure why you're bringing Modula-2 into the discussion, but C++ is very much about compile-time analysis, in fact I'd say it's one of the main reasons to use it. (Also because it's fast, but that has something to do with compile-time analysis as well.)

With KDevelop's nice integration of compiler warnings it usually doesn't hurt to turn on -Weverything. It helps me see a lot of issues that I'd probably overlook otherwise.

If you don't want your compiler to tell you what (not) to do, use a language that isn't compiled.

plugins/astyle/astyle_plugin.cpp
36

Basically constexpr forces compile-time evaluation, which means that

  1. The variable will (almost certainly) land in a read-only segment.
  2. There is no dynamic initialization order fiasco.

In this case it wouldn't make much of a difference, since it is a compile-time constant and the compiler will treat is as such. But with constexpr you get a compilation error if it isn't.

rjvbb added a comment.Sep 22 2018, 8:14 AM
Eventually we have to compile our code anyway, so aren't we all dependent on the compiler?

Not what I meant of course. I don't need the compiler to tell me where a break isn't needed; I prefer to remind myself of such things and that I'm putting it there anyway (because the compiler won't be able to tell me I really forgot to use one, not unless you make some kind of "no break here" statement obligatory). In a sense this really isn't that different from wondering why you need to add a closing brace after a return...
Remember the old advice of always adding a default case to a switch, and putting a break in that one too (which isn't required either). I did notice the other day that kdevelop's build system now adds a flag that causes a compiler error when a switch statement is incomplete and/or lacks a default case.

I'm not sure why you're bringing Modula-2 into the discussion

Because it's a lot stricter than C++ and at the same time a whole lot simpler. Yet even that language (like Pascal) doesn't require to put a closing semicolon in the last position where it would be required; I always put them anyway (and had a colleague who loved removing them).

My attitude towards this sort of thing may depend on the fact that my native language is Dutch (which has a grammar that's basically defined by exceptions) but that's a discussion for another place and time :)

With KDevelop's nice integration of compiler warnings it usually doesn't hurt to turn on `-Weverything`. It helps me see a lot of issues that I'd probably overlook otherwise.

Honestly, I've seen better and the more warnings you get the less usable the contextual parser becomes because it can no longer display its information (all the more annoying if you need that information to correct an error). That contextual parser ("KIntelliSense" ^^) is among the main reasons that got me hooked on KDevelop btw.

I also still spend a lot of time coding in vi.

If you don't want your compiler to tell you what (not) to do, use a language that isn't compiled.

Not necessary, just use C O:-)

I prefer to remind myself of such things and that I'm putting it there anyway (because the compiler won't be able to tell me I really forgot to use one, not unless you make some kind of "no break here" statement obligatory).

In a way that is what is slowly happening. With C++17 there will be a standardized attribute [[fallthrough]] and I suspect that -Wfallthrough will become a standard warning like -Wreturn-type by then. (The latter warns if some code paths in a function don't return a value, as in int f() {}.) As for Qt, there is of course Q_FALLTHROUGH() in qcompilerdetection.h, which chooses one of [[{,gnu::,clang::}fallthrough]], depending on which is available.

With that attribute fallthroughs can be annotated, so the compiler can easily warn if a break statement is missing. When there is a false positive, it can easily be fixed by adding the attribute.

rjvbb added a comment.Sep 22 2018, 7:05 PM

I have no (big) problems with a fallthrough attribute, but if it's going to be obligatory I'd want the complementary attribute (the break) to be obligatory too...

A warning for a missing return value is a lot more useful IMHO; bugs because of that are a lot more difficult to figure out in my experience.

I have no (big) problems with a fallthrough attribute, but if it's going to be obligatory I'd want the complementary attribute (the break) to be obligatory too...

Please have a look at D15694. With that change, every case block has to end either with a terminator statement (break, continue, return, throw, ...) or with a fallthrough attribute.

Thanks. Learned about ObjC++ this way... the world is full of strange things :)

Though as I now realize, by giving the current kdevelop runtime log more attention, the rest of the world yet has to learn about those strange things:

https://cgit.freedesktop.org/xdg/shared-mime-info/tree/freedesktop.org.xml.in has no mention of text/x-objc++src or text/x-objchdr.

@rjvbb Please consider picking up your work with https://bugs.freedesktop.org/show_bug.cgi?id=98823
And ideally also do a similar patch for "text/x-objchdr". Though trying to find any reference, is that MIME type really in use out there?

Those warnings in the log are annoying, and we should not have mentions of MIME types which are no-where defined in public.

I would like to eliminate the following warnings from KDevelop's output:

kdevplatform.shell: formatter plugin "kdevastyle" supports unknown mimetype entry "text/x-objc++src"
kdevplatform.shell: formatter plugin "kdevastyle" supports unknown mimetype entry "text/x-objchdr"
kdevplatform.shell: formatter plugin "kdevcustomscript" supports unknown mimetype entry "text/x-objc++src"
kdevplatform.shell: formatter plugin "kdevcustomscript" supports unknown mimetype entry "text/x-objchdr"

I am going to contribute Objective-C++ source MIME type to https://gitlab.freedesktop.org/xdg/shared-mime-info. But I don't have access to macOS, so need some information. @rjvbb, could you please try to answer the following questions?

  1. KDevelop and QtCreator use text/x-objc++src as the MIME type name. But KSyntaxHighlighting refers to text/x-objcpp-src. Is text/x-objc++src the canonical name used in macOS or is there some other reason to select one of these names?
  2. The Objective-C Wikipedia article and the KSyntaxHighlighting's data file linked-to above claim that *.M is one of the file extensions of Objective-C++ source files. But since macOS's filesystem is case-insensitive by default, I suppose *.mm is used almost universally and *.M practically never. Is that right? If so, there is no need to complicate the MIME database with 3 case-sensitive attributes.
  3. Currently text/x-objcsrc has only one magic pattern: "#import". My brief research indicates that many Objective-C and Objective-C++ files start with comments or #include. Is that right? If so, I would like to add the following magic to these MIME types with a priority lower than text/x-csrc's to facilitate distinguishing them from text/x-matlab and text/x-troff-mm MIME types with file extensions *.m and *.mm respectively:
+      <match type="string" value="/*" offset="0"/>
+      <match type="string" value="//" offset="0"/>
+      <match type="string" value="#include" offset="0"/>
  1. My Internet searches indicate that text/x-objchdr MIME type is very rarely used. Qt Creator doesn't mention it. KSyntaxHighlighting refers to text/x-c-hdr in data/syntax/objectivec[pp].xml. This KCoreAddons commit removed it as useless. Indeed, it shares the extension *.h with C. Not even C++ competes for this file extension in the MIME type database, even though a large proportion of *.h files are C++ headers. So I doubt the MIME type DB maintainers would agree to risk issues in C header detection to accommodate this rare case. Theoretically, Objective-C headers could be detected with #import magic, but that wouldn't be reliable, because Objective-C headers can start with a comment or with #include, which share C's syntax. So I am inclined to remove the references to text/x-objchdr from KDevelop. If I understand correctly, this MIME type is currently useless in KDevelop anyway, because it is only included in MIME type lists and never appears in conditions. Is text/x-objchdr used in macOS? If yes, how is it distinguished from text/x-chdr there?

If no one has objections or suggestions on this topic, I am going to create a shared-mime-info merge request tomorrow: this patch plus tests

My best judgment on the questions above:

  1. Neither of the names is widely used. The text/x-objc++src name is consistent with the existing names text/x-c++src and text/x-objcsrc. So I am going to use it, and once merged, create the renaming KSyntaxHighlighting merge request.
  2. Haven't seen any *.M files, so won't use this extension.
  3. No longer relevant: I found out that for two MIME types to share common magic elements, MIME type inheritance is a more elegant and the only working approach.
  4. Will create a KDevelop merge request that removes all mentions of text/x-objchdr.
rjvbb added a comment.Oct 11 2021, 5:22 PM

Igor Kushnir wrote on 20211011::12:38:18 re: "D15532: [Astyle] Add Objective C to list of languages with formatters"

  1. Will create a KDevelop merge request that removes all mentions of text/x-objchdr.

I already indicated why I think that's not a good idea (but I can always just patch it back)

  1. Will create a KDevelop merge request that removes all mentions of text/x-objchdr.

I already indicated why I think that's not a good idea (but I can always just patch it back)

Where did you indicate that? Are Objective-C headers distinguished from C headers on your system? If yes, then how?

rjvbb added a comment.Oct 11 2021, 6:04 PM

Igor Kushnir wrote on 20211011::17:30:08 re: "D15532: [Astyle] Add Objective C to list of languages with formatters"

Where did you indicate that? Are Objective-C headers distinguished from C headers on your system? If yes, then how?

In the reply I thought I sent yesterday (it got held up for moderation for the kdevelop-devel ML; I assumed it was delivered to the other "reply-all" destinations):

René J.V. Bertin wrote on 20211009::22:00:04 re: "Re: D15532: [Astyle] Add Objective C to list of languages with formatters"

pino removed a subscriber: pino.Oct 11 2021, 6:05 PM

In the reply I thought I sent yesterday (it got held up for moderation for the kdevelop-devel ML; I assumed it was delivered to the other "reply-all" destinations):

René J.V. Bertin wrote on 20211009::22:00:04 re: "Re: D15532: [Astyle] Add Objective C to list of languages with formatters"

Could you please cancel that email and paste its contents in a Phabricator comment?

rjvbb added a comment.Oct 12 2021, 8:53 AM

Could you please cancel that email and paste its contents in a Phabricator comment?

[grrr, replying via email didn't use to suck this much AFAICR]

On Saturday October 09 2021 17:25:14 Igor Kushnir wrote:

@rjvbb, could you please try to answer the following questions?

Is text/x-objc++src the canonical name used in macOS or is there some other reason to select one of these names?

No idea to be honest, nor how I'd figure that out...

I suppose //*.mm// is used almost universally and //*.M// practically never. Is that right? If so, there is no need to complicate the MIME database with 3 //case-sensitive// attributes.

I can indeed not remember ever having seen a .M file. That said, HFS and APFS are case insensitive but case preserving, so software can and does use filename case. I think that you should take it up with the mimeinfo maintainers what they prefer here.

  1. Currently text/x-objcsrc has only one magic pattern: "#import". My brief research indicates that many Objective-C and Objective-C++ files start with comments or #include. Is that right?

They certainly can (just like C and C++ can use #import, at least as far as GCC and Clang are concerned). A more distinguishing test would be for the presence of an @ symbol at the start of a line (@class, @interface, @implementation, @end etc), but those don't have to be present in a file either for it to be valid ObjC.
I would add #import though; Apple code will typically use that.

Theoretically, Objective-C headers could be detected with #import magic, but that wouldn't be reliable, because Objective-C headers can start with a comment or with #include, which share C's syntax. So I am inclined to remove the references to text/x-objchdr from KDevelop.

I'm pretty certain I have a patch that improves ObjC support in KDevelop, and also that I put it up for review at some point. Maybe it was never incorporated because of the mime info not being in the mimeinfo database, I can't remember.

If I understand correctly, this MIME type is currently useless in KDevelop anyway, because it is only included in MIME type lists and never appears in conditions. Is text/x-objchdr used in macOS? If yes, how is it distinguished from text/x-chdr there?

Again, I can't answer that question. Since Mac OS X the Mac OS chiefly uses file extensions rather than the old type and creator 4char codes that were stored in the resource fork. I rarely use the Finder for filesystem navigation but I'll try to remember to look how it labels, erm, *C* header files.
KDevelop would do good to distinguish ObjC(++) from C(++) header files so it doesn't signal spurious errors when editing one, but it might be able to do that by installing a complementary mimeinfo module that isn't part of the official database.

ObjC headers will typically contain the @interface keyword, but that can of course occur quite far into the file. Idem for @class.

FWIW: once you do get KDevelop to recognise ObjC correctly, the memory footprint of the parser increases enormously as soon as it encounters such a file. I think the compilation of ObjC files also requires oodles of RAM, but that is released as soon as the compilation is done; KDevelop's parser doesn't release its memory so easily (in any case not when you simply close the file).

igorkushnir added a comment.EditedOct 12 2021, 9:58 AM

Theoretically, Objective-C headers could be detected with #import magic, but that wouldn't be reliable, because Objective-C headers can start with a comment or with #include, which share C's syntax. So I am inclined to remove the references to text/x-objchdr from KDevelop.

I'm pretty certain I have a patch that improves ObjC support in KDevelop, and also that I put it up for review at some point. Maybe it was never incorporated because of the mime info not being in the mimeinfo database, I can't remember.

Here is your patch: D15530. One of your comments in that review is:

Removed the x-objchdr mimetype (to avoid additional confusion what a .h file could be) and added my current x-objc+src definition to kdevclang.xml .

This comment sort of resigns to remove mentions of x-objchdr from KDevelop's code.

If I understand correctly, this MIME type is currently useless in KDevelop anyway, because it is only included in MIME type lists and never appears in conditions. Is text/x-objchdr used in macOS? If yes, how is it distinguished from text/x-chdr there?

Again, I can't answer that question. Since Mac OS X the Mac OS chiefly uses file extensions rather than the old type and creator 4char codes that were stored in the resource fork. I rarely use the Finder for filesystem navigation but I'll try to remember to look how it labels, erm, *C* header files.
KDevelop would do good to distinguish ObjC(++) from C(++) header files so it doesn't signal spurious errors when editing one, but it might be able to do that by installing a complementary mimeinfo module that isn't part of the official database.

ObjC headers will typically contain the @interface keyword, but that can of course occur quite far into the file. Idem for @class.

FWIW: once you do get KDevelop to recognise ObjC correctly, the memory footprint of the parser increases enormously as soon as it encounters such a file. I think the compilation of ObjC files also requires oodles of RAM, but that is released as soon as the compilation is done; KDevelop's parser doesn't release its memory so easily (in any case not when you simply close the file).

So we don't know a reliable way to tell Objective-C headers from C headers. The dropping of text/x-objchdr from KCoreAddons indicates that this isn't easy as well.

  1. Will create a KDevelop merge request that removes all mentions of text/x-objchdr.

I already indicated why I think that's not a good idea (but I can always just patch it back)

As KDevelop doesn't recognize any *.h files as Objective-C headers on your system, I don't understand what you gain from the presence of the lines {QStringLiteral("text/x-objchdr"), QStringLiteral("Objective-C")}, in KDevelop's code and why would you want to patch them back in. Do you think that detecting only the headers that start with #import as Objective-C headers would be helpful? Even if it were helpful, I don't know how to achieve that, because having two MIME types with the same magic elements (text/x-objcsrc and text/x-objchdr) wouldn't work. And neither of these two MIME types should inherit the other, right?

EDIT: I see now that text/x-chdr inherits text/x-csrc, so text/x-objchdr can inherit text/x-objcsrc. But adding another MIME type with *.h pattern to the shared or KDevelop-custom MIME database would substantially slow down MIME type detection of all header files, because their contents would have to be read for magic matching. See e.g. Qt's MIME detection algorithm. Does unreliable detection of Objective-C headers justify this performance hit?

Is text/x-objc++src the canonical name used in macOS or is there some other reason to select one of these names?

No idea to be honest, nor how I'd figure that out...

A follow-up question then: does Qt use its bundled MIME type database on macOS? If yes, then macOS probably doesn't use such MIME type names.

If I understand correctly, this MIME type is currently useless in KDevelop anyway, because it is only included in MIME type lists and never appears in conditions. Is text/x-objchdr used in macOS? If yes, how is it distinguished from text/x-chdr there?

Again, I can't answer that question. Since Mac OS X the Mac OS chiefly uses file extensions rather than the old type and creator 4char codes that were stored in the resource fork. I rarely use the Finder for filesystem navigation but I'll try to remember to look how it labels, erm, *C* header files.
KDevelop would do good to distinguish ObjC(++) from C(++) header files so it doesn't signal spurious errors when editing one, but it might be able to do that by installing a complementary mimeinfo module that isn't part of the official database.

ObjC headers will typically contain the @interface keyword, but that can of course occur quite far into the file. Idem for @class.

Yes, it is interesting if Finder ever labels *.h files as Objective-C headers - when they start with #import, @interface or @class?

The Compiled Programming Language Sources section of Apple's System Declared UTI documentation contains cHeader, cSource, cPlusPlusHeader, cPlusPlusSource, objectiveCPlusPlusSource and objectiveCSource, but no objectiveC[PlusPlus]Header.

Well, that's what I get for trusting my memory, I didn't even realise at first that you'd been jumping on a PR that I started. :-/

Just carry on then, forget my objections about the headers.

As to Qt using shared mimeinfo on Mac, I couldn't tell for certain. I basically only use the MacPorts build of Qt, which as you're probably aware is an environment that aims to do things in a linuxy way. You asked about Qt's bundled mime info database, and I have to assume they do use that, esp. if/because shared-mimeinfo isn't available typically.

I strongly doubt that the Finder would analyse file contents in order to give a type indication. But it could easily use metadata provided by Xcode in the direntry and/or by Spotlight (which *does* analyse file content). As I said before, I rarely use the Finder so I'm far from an expert on its quirks and features. In fact, I'm probably not an expert on the Mac GUI at all, it's been so long that I've been using mine as a sort of Franke(li)nux - the KDevelop session I use most runs under X11 even ;)

Created the text/x-objc++src merge request here: https://gitlab.freedesktop.org/xdg/shared-mime-info/-/merge_requests/158. But no one has even commented on it in 1.5 months. The maintainers of shared-mime-info don't review merge requests regularly lately.

Created the text/x-objchdr removal merge request here: https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/314.