If this sounds like a gtk-doc defamation, I apologize. However, if I thought everything was all right with gtk-doc, I would never have started writing yagdoc, so this is essentially a list of things to do differently. The overall structure is very similar to gtk-doc (one reason for this is of course the intended gtk-doc compatibility):
Output here means DocBook XML source, which has to be further processed in order to obtain something human-friendly. The possibility of generating some kinds of output directly from the parsed declarations and documentation should also be considered.
Does the list of Gtk+ signal argument names belong to gtk-doc? Or the map of event signals to the particular GdkEvent subtypes they use? Is it necessary to add an exception to exclude gnome_keyring_item_info_get_type() to gtk-doc code?
The goal is not to deprive Gtk+ of nicely presented signal arguments and their types. But why could other libraries not have their signal arguments and their types nicely presented too? And why should the addition of new signals with arguments to Gtk+ require gtk-doc updates?
Step 1 is of course to avoid non-generic information creep. Step 2 is then to implement mechanisms that enable all projects to easily (pre|post)process the information, and to insert hooks where necessary.
The worst maintenance nightmare is gettext translations; gtk-doc documentation is a close second. It is necessary to strictly separate computer-generated and manually written data.
Having lots of optionally existing files that something sometimes attempts to create if they don't exist (having to guess in which directory) makes writing sensible Makefile rules hard. Keep the number of auxiliary files small, and if they appear in Makefile dependencies, ensure they exist after bootstrapping.
In particular (some of the following is implemented in recent gtk-doc too, often by me):
--rebuild-sections mode, more control over sorting declarations into sections will be possible with a set of regular-expression based rules. To split garray.h declarations into GArray, GPtrArray and GByteArray documentation, about 2 × 3 = 6 rules will be likely needed.
--rebuild-types will be the only behaviour.
After several years of development and many bug reports, gtk-doc still has difficulties with basic C syntax: nested structures, sensitivity to line breaks and other ignorables, recognizing unsigned long as a type, forward declarations of enums, …. A different approach is evidently necessary.
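As an aside on the regular-expression based section rules mentioned in the list above, here is a minimal sketch of what such a rule set for garray.h might look like. The rule layout (symbol kind, name pattern, target section), the first-match-wins policy and the function name section_for are all assumptions made for illustration; they are just one possible reading of "2 × 3 = 6 rules", not the actual yagdoc rule format.

```python
import re

# Hypothetical rule table: (symbol kind, name pattern, target section).
# Two kinds times three sections gives six rules; the first match wins,
# so the more specific prefixes are listed before the generic ones.
SECTION_RULES = [
    ("function", re.compile(r"^g_ptr_array_"), "GPtrArray"),
    ("function", re.compile(r"^g_byte_array_"), "GByteArray"),
    ("function", re.compile(r"^g_array_"), "GArray"),
    ("type", re.compile(r"^GPtrArray"), "GPtrArray"),
    ("type", re.compile(r"^GByteArray"), "GByteArray"),
    ("type", re.compile(r"^GArray"), "GArray"),
]


def section_for(kind, name, default="garray"):
    """Return the section a declaration is sorted into."""
    for rule_kind, pattern, section in SECTION_RULES:
        if rule_kind == kind and pattern.search(name):
            return section
    # Anything no rule claims stays in the per-header default section.
    return default
```

Declarations that match no rule simply fall back to the default section, so a project only has to write rules for the headers it wants to split.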
While the standard Gnome/Gtk+/GLib code idioms must be recognized and supported, it should not really matter at which point one breaks lines in function prototypes. Regular expressions alone are not sufficient to parse a reasonably large subset of C; a standard recursive parser should be used instead. (Another facet of this issue is how the documentation of complex nested structures should be written and presented.)
In addition, the user should be able to easily teach yagdoc his local variations and conventions: constructs that should be ignored, or constructs that can stand for const or extern.
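To make the difference from line-oriented regular expressions concrete, here is a minimal sketch of a tokenizer followed by a small recursive-descent parser for function prototypes. It is not yagdoc code: the names MY_API and MY_CONST_RETURN, the IGNORED and ALIASES tables and the dictionary layout of the result are hypothetical, and the grammar is deliberately tiny (no arrays, no struct bodies, and unnamed parameters of qualified typedef'd types are not handled).

```python
import re

TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\.\.\.|[()*,;])")
DELIMITERS = {"(", ")", ",", ";"}
TYPE_KEYWORDS = {"void", "char", "short", "int", "long", "float", "double",
                 "signed", "unsigned", "const"}

# Local conventions the user can teach the parser (hypothetical names).
IGNORED = {"MY_API"}                    # constructs to drop entirely
ALIASES = {"MY_CONST_RETURN": "const"}  # constructs standing for keywords


def tokenize(text):
    """Split text into tokens; whitespace and line breaks are irrelevant."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError("unexpected character at offset %d" % pos)
        token = ALIASES.get(match.group(1), match.group(1))
        if token not in IGNORED:
            tokens.append(token)
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expect(self, token):
        if self.next() != token:
            raise SyntaxError("expected %r" % token)

    def declaration(self):
        """Parse 'type name', 'type name(params)' or 'type (*name)(params)'."""
        words = []
        while self.peek() not in DELIMITERS:
            words.append(self.next())
        if self.peek() == "(" and self.tokens[self.pos + 1:self.pos + 2] == ["*"]:
            # Function pointer; its parameter list is parsed recursively.
            self.expect("(")
            self.expect("*")
            name = self.next()
            self.expect(")")
            return {"type": " ".join(words), "name": name,
                    "params": self.params()}
        name = None
        if len(words) > 1 and words[-1] != "*" and words[-1] not in TYPE_KEYWORDS:
            name = words.pop()
        decl = {"type": " ".join(words), "name": name}
        if self.peek() == "(":
            decl["params"] = self.params()
        return decl

    def params(self):
        self.expect("(")
        result = []
        if self.peek() != ")":
            result.append(self.declaration())
            while self.peek() == ",":
                self.next()
                result.append(self.declaration())
        self.expect(")")
        return result


def parse_prototype(text):
    parser = Parser(tokenize(text))
    decl = parser.declaration()
    parser.expect(";")
    return decl
```

Because the parser works on tokens, a prototype written on one line and the same prototype broken after every word produce identical results, and unsigned long is collected as a type like any other word sequence. Input that does not fit the grammar raises SyntaxError instead of being silently skipped.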
We have the luxury of object serialization Just Working in Python. We also have the actual token lists and parse trees. Analysing the same text fragments again and again in different places – with possibly slightly different regular expressions – as happens in gtk-doc is then avoidable and has to be avoided.
A small downside of the use of serialization is that the rough equivalent of foo-decl.txt (containing more information than that though) will not be human readable. It is not meant to be human-edited, and automated changes should be done by extending the build process with Python code. So the only uses are: tools that read it and humans wishing to look at it. Tools written in Python are encouraged to just deserialize it; other tools and humans can be served by a “pretty printer” that extracts information in text form. In fact, various dumpers are already being written for visualization of the in-core representation.
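The round trip described above can be sketched with the standard pickle module. The record layout (plain dictionaries with kind, name, file, line and tokens keys) and the function names are assumptions made for this example; real code would more likely serialize instances of its own classes, which pickle handles as well.

```python
import pickle


def serialize(declarations):
    """Turn the in-core declaration list into an opaque byte string."""
    return pickle.dumps(declarations, protocol=pickle.HIGHEST_PROTOCOL)


def deserialize(data):
    """Python tools simply get the original objects back."""
    return pickle.loads(data)


def dump_text(declarations):
    """A tiny "pretty printer" for humans and non-Python tools."""
    lines = []
    for decl in declarations:
        lines.append("%s:%d: %s %s" % (decl["file"], decl["line"],
                                       decl["kind"], decl["name"]))
        lines.append("    tokens: %s" % " ".join(decl["tokens"]))
    return "\n".join(lines)
```

The token list travels with each declaration, so a later stage never has to re-read and re-match the source text. As usual with pickle, only data produced by one's own build should be deserialized.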
The mechanism of defining variables in the Makefile, including another Makefile that tries to do something reasonable with them, and passing dozens of options to the various tools has its limitations. It also makes the output silently dependent on the Makefile (i.e. this dependency does not appear in the Makefile rules). Many people give up and customize gtk-doc.make directly.
While there's nothing wrong with such customization, we should be able to do better, and since the border between configuration and extension is fuzzy, use one mechanism for both.
configure-determined parameters. It would be possible to turn the yagdoc configuration file into a configure template (.in), but that would be ugly. A better approach is to generate a file imported to the main configuration file. It is also possible to extract variable values from the Makefile in yagdoc.
All assumptions (as opposed to hard facts, for instance about the C language) should be put into data instead of hardcoding them deep in some four-line regular expressions. And such data should be modifiable in the configuration file.
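A minimal sketch of assumptions kept as data: the tool ships a table of defaults, and the configuration file is a piece of Python code run against a copy of that table. The key names (ignored_macros, keyword_aliases, get_type_suffix), the MY_* names and the function load_config are hypothetical; the GLib macros are used only as familiar examples of the kind of convention that would live in such a table.

```python
import copy

# Defaults shipped with the tool: conventions, not facts about C.
DEFAULTS = {
    "ignored_macros": {"G_BEGIN_DECLS", "G_END_DECLS"},
    "keyword_aliases": {"G_CONST_RETURN": "const"},
    "get_type_suffix": "_get_type",
}

# What a project's configuration file might contain.
USER_CONFIG = """
ignored_macros.add("MY_API")
keyword_aliases["MY_EXTERN"] = "extern"
"""


def load_config(source, defaults=DEFAULTS):
    """Run configuration code against a private copy of the defaults."""
    namespace = copy.deepcopy(defaults)
    exec(compile(source, "<config>", "exec"), namespace)
    # exec() adds the builtins to the namespace; they are not settings.
    namespace.pop("__builtins__", None)
    return namespace
```

Since the configuration is code, the same file can also define functions, which is where the fuzzy border between configuration and extension stops being a problem.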
Documentation generation stages should be factored into overridable parts and/or allow user hooks at suitable places.
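One way this could look is sketched below: every stage is an ordinary method that a subclass can override, and user functions can be attached before or after each stage. The stage names (scan, merge, output), the class Pipeline and its methods are assumptions made for this example, and the stages themselves are trivial stand-ins.

```python
class Pipeline:
    STAGES = ("scan", "merge", "output")

    def __init__(self):
        self._hooks = {}

    def add_hook(self, stage, when, func):
        """Register func to run 'pre' or 'post' the given stage."""
        if stage not in self.STAGES or when not in ("pre", "post"):
            raise ValueError("unknown hook point: %s-%s" % (when, stage))
        self._hooks.setdefault((stage, when), []).append(func)

    def _run_hooks(self, stage, when, data):
        for func in self._hooks.get((stage, when), []):
            data = func(data)
        return data

    def run(self, data):
        for stage in self.STAGES:
            data = self._run_hooks(stage, "pre", data)
            data = getattr(self, stage)(data)   # overridable in subclasses
            data = self._run_hooks(stage, "post", data)
        return data

    # Trivial stand-ins for the real stages.
    def scan(self, symbols):
        return list(symbols)

    def merge(self, symbols):
        return sorted(symbols)

    def output(self, symbols):
        return "\n".join(symbols)


def drop_get_type_exception(symbols):
    """A project-specific exclusion that lives in the project, not the tool."""
    return [s for s in symbols if s != "gnome_keyring_item_info_get_type"]
```

A project would register its exception with add_hook("scan", "post", drop_get_type_exception) in its own configuration, so the generic code never needs to know about gnome_keyring_item_info_get_type().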
Many extensions have the form of new source code markup or documentation tags that are then processed somehow. While the processing will typically require writing some real code, the recognition code should not require any changes. For instance, noting that symbols are deprecated according to their occurrence inside certain preprocessor conditionals is a mechanism that is equally hard to implement specifically for deprecation and for general preprocessor blocks.
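The preprocessor example can be sketched as follows: the generic part records, for every symbol, the stack of conditionals that enclose it, and deprecation becomes a one-line query on that record. The event stream format, the function names and the guard FOO_DISABLE_DEPRECATED are hypothetical; a real scanner would produce the events from its token stream.

```python
def attach_conditions(events):
    """Map each symbol to the tuple of preprocessor conditions around it.

    events is an iterable of (kind, value) pairs, where kind is one of
    'if', 'ifdef', 'ifndef', 'else', 'endif' or 'symbol'.
    """
    stack, result = [], {}
    for kind, value in events:
        if kind in ("if", "ifdef", "ifndef"):
            stack.append((kind, value))
        elif kind == "else":
            # The else branch is the negation of the innermost condition.
            opener, condition = stack.pop()
            stack.append(("not " + opener, condition))
        elif kind == "endif":
            stack.pop()
        elif kind == "symbol":
            result[value] = tuple(stack)
    return result


def is_deprecated(conditions, guard="FOO_DISABLE_DEPRECATED"):
    """Deprecation is just one possible interpretation of the record."""
    return ("ifndef", guard) in conditions
```

Any other convention based on conditionals (experimental API, platform-specific symbols) needs only another small query function, not another recognition pass.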
Every failure that is not due to yagdoc bugs (or system failures) has its origin in the source files. We might not be able to pinpoint the primary cause, but we can always point to the line where we noticed things are not right. Not emitting error messages in some standard format (for instance GCC-like) is a mortal sin.
Also, an “advantage” of Python over Perl is that essentially everything that goes wrong raises an exception, and unhandled exceptions are fatal. The code just has to be right and handle bogus input with grace.
The parse-or-perish approach perhaps requires more effort and, unfortunately, sometimes cooperation from the user. On the other hand, the user will not need to wonder why his declarations are not picked up: either they are, or they produce errors.
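A minimal sketch of how the two points fit together: a problem in the source files is raised as an exception that knows where it was noticed, and the top level turns it into a GCC-like "file:line: error: message" line and a non-zero exit status. The class SourceError and the function run are hypothetical names chosen for this example.

```python
import sys


class SourceError(Exception):
    """A problem that has its origin in the user's source files."""

    def __init__(self, filename, line, message):
        super().__init__(message)
        self.filename = filename
        self.line = line
        self.message = message

    def __str__(self):
        # Same shape as GCC diagnostics, so editors can jump to the line.
        return "%s:%d: error: %s" % (self.filename, self.line, self.message)


def run(action, emit=None):
    """Run action(); report a SourceError and return an exit status."""
    if emit is None:
        emit = lambda text: sys.stderr.write(text + "\n")
    try:
        action()
    except SourceError as error:
        emit(str(error))
        return 1
    return 0
```

Only SourceError is caught here on purpose: any other exception is a bug in the tool itself and should stay loud and fatal.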
See Gtk-doc Future for some interesting ideas. The key word is some: there are features that yagdoc naturally implements because that's the way it should work from the beginning, some interesting extension suggestions, and cases of – probably bad weed you'd rather quit smoking.