Developer Interface¶
This documentation covers the public interfaces fedmsg provides. Unless otherwise noted, all documented interfaces follow Semantic Versioning 2.0.0. If the interface you depend on is not documented here, it may change without warning in a minor release.
Python¶
Sending and Receiving Messages¶
Federated Message Bus Client API
-
fedmsg.
init
(**kw)[source]¶ Initialize an instance of fedmsg.core.FedMsgContext.
The config is loaded with fedmsg.config.load_config() and updated by any keyword arguments. This config is used to initialize the context object.
The object is stored in a thread local as fedmsg.__local.__context.
-
fedmsg.
publish
(*args, **kw)[source]¶ Send a message over the publishing zeromq socket.
>>> import fedmsg
>>> fedmsg.publish(topic='testing', modname='test', msg={
...     'test': "Hello World",
... })
The above snippet will send the message '{test: "Hello World"}' over the <topic_prefix>.dev.test.testing topic. The fully qualified topic of a message is constructed out of the following pieces: <topic_prefix>.<env>.<modname>.<topic>
This function (and other API functions) does a little bit more heavy lifting than it lets on. If the “zeromq context” is not yet initialized, fedmsg.init() is called to construct it and store it as fedmsg.__local.__context before anything else is done.
An example from Fedora Tagger – SQLAlchemy encoding
Here’s an example from fedora-tagger that sends the information about a new tag over org.fedoraproject.{dev,stg,prod}.fedoratagger.tag.update:
>>> import fedmsg
>>> fedmsg.publish(topic='tag.update', msg={
...     'user': user,
...     'tag': tag,
... })
Note that the tag and user objects are SQLAlchemy objects defined by tagger. They both have .__json__() methods which fedmsg.publish() uses to encode both objects as stringified JSON for you. Under the hood, specifically, .publish uses fedmsg.encoding to do this.
fedmsg has also guessed the module name (modname) of its caller and inserted it into the topic for you. The code from which we stole the above snippet lives in fedoratagger.controllers.root. fedmsg figured that out and stripped it down to just fedoratagger for the final topic of org.fedoraproject.{dev,stg,prod}.fedoratagger.tag.update.
Shell Usage
You could also use the fedmsg-logger from a shell script like so:
$ echo "Hello, world." | fedmsg-logger --topic testing
$ echo '{"foo": "bar"}' | fedmsg-logger --json-input
Parameters:
- topic (unicode) – The message topic suffix. This suffix is joined to the configured topic prefix (e.g. org.fedoraproject), environment (e.g. prod, dev, etc.), and modname.
- msg (dict) – A message to publish. This message will be JSON-encoded prior to being sent, so the object must be composed of JSON-serializable data types. Please note that if this is already a string, JSON serialization will be applied to that string.
- modname (unicode) – The module name that is publishing the message. If this is omitted, fedmsg will try to guess the name of the module that called it and use that to produce an intelligent topic. Specifying modname explicitly overrides this behavior.
- pre_fire_hook (function) – A callable that will be called with a single argument – the dict of the constructed message – just before it is handed off to ZeroMQ for publication.
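The topic assembly described above can be sketched in a few lines of plain Python. This is only an illustration, not fedmsg's own code: build_topic and guess_modname are hypothetical helper names, and reducing a module path to its first dotted component is an assumption based on the fedoratagger example.

```python
def guess_modname(module_path):
    """Hypothetical helper: keep only the top-level package name."""
    return module_path.split(".")[0]


def build_topic(topic_prefix, environment, modname, topic):
    """Join prefix, environment, modname and topic suffix with dots."""
    return ".".join([topic_prefix, environment, modname, topic])


# The calling module from the fedora-tagger example above.
modname = guess_modname("fedoratagger.controllers.root")
full_topic = build_topic("org.fedoraproject", "prod", modname, "tag.update")
print(full_topic)
```

Passing modname explicitly, as in the first publish example, would simply skip the guessing step.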
Configuration¶
For the list of configuration options, see the Configuration documentation.
-
class
fedmsg.config.
FedmsgConfig
[source]¶ The fedmsg configuration dictionary.
To access the actual configuration, use the
conf
instance of this class.
-
fedmsg.config.
conf
= {}¶ The fedmsg configuration dictionary. All valid configuration keys are guaranteed to be in the dictionary and to have a valid value. This dictionary should not be mutated. This is meant to replace the old
load_config()
API, but is not backwards-compatible with it.
-
fedmsg.config.
load_config
(extra_args=None, doc=None, filenames=None, invalidate_cache=False, fedmsg_command=False, disable_defaults=False)[source]¶ Set up a runtime config dict by integrating the following sources (ordered by precedence):
- defaults (unless disable_defaults = True)
- config file
- command line arguments
If the fedmsg_command argument is False, no command line arguments are checked.
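A layered merge of this kind can be sketched with plain dictionaries. The sketch assumes that sources later in the list override earlier ones (command line over config file over defaults); that ordering is an assumption, and merge_config and its arguments are hypothetical names rather than part of fedmsg.

```python
def merge_config(defaults, file_config, cli_args,
                 disable_defaults=False, fedmsg_command=False):
    """Combine three config sources; later sources win (assumed order)."""
    config = {}
    if not disable_defaults:
        config.update(defaults)
    config.update(file_config)
    # Command line arguments are only consulted for fedmsg commands.
    if fedmsg_command:
        config.update(cli_args)
    return config


defaults = {"io_threads": 1, "topic_prefix": "com.example"}
file_config = {"topic_prefix": "org.fedoraproject"}
cli_args = {"io_threads": 4}

merged = merge_config(defaults, file_config, cli_args, fedmsg_command=True)
print(merged)
```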
-
fedmsg.config.
build_parser
(declared_args, doc, config=None, prog=None)[source]¶ Return the global argparse.ArgumentParser used by all fedmsg commands.
Extra arguments can be supplied with the declared_args argument.
Cryptography and Message Signing¶
fedmsg.crypto
- Cryptographic component of fedmsg.
Introduction¶
In general, we assume that ‘everything on the bus is public’. Even though all
the zmq endpoints are firewalled off from the outside world with iptables, we
do have a forwarding service set up that indiscriminately
forwards all messages to anyone who wants them.
(See fedmsg.commands.gateway.gateway
for that service.)
So, the issue is not encrypting messages so they can’t be read. It is up to
sensitive services like FAS to not send sensitive information in the first
place (like passwords, for instance).
However, since at some point, services will respond to and act on messages that come across the bus, we need facilities for guaranteeing a message comes from where it ought to come from. (Tangentially, message consumers need a simple way to declare where they expect their messages to come from and have the filtering and validation handled for them).
There should also be a convenient way to turn crypto off both globally and locally. Justification: a developer may want to work out a bug without any messages being signed or validated. In production, certain senders may send non-critical data from a corner of Fedora Infrastructure in which it’s difficult to sign messages. A consumer of those messages should be allowed to ignore validation for those and only those expected unsigned messages.
Two backend methods are available to accomplish this:
fedmsg.crypto.x509
fedmsg.crypto.gpg
Which backend is used is configured by the crypto_backend configuration value.
Certificates¶
To accomplish message signing, fedmsg must be able to read certificates and a private key on disk in the case of the fedmsg.crypto.x509 backend, or to read public and private GnuPG keys in the case of the fedmsg.crypto.gpg backend. For message validation, it need only be able to read the x509 certificate or gpg public key. Exactly which certificates are used is determined by looking up the certname in the certnames config dict.
We use a large number of certs for the deployment of fedmsg. We have one cert per service-host. For example, if we have 3 fedmsg-enabled services and each service runs on 10 hosts, then we have 30 unique certificate/key pairs in all.
The intent is to create difficulty for attackers. If a low-security service on a particular box is compromised, we don’t want the attacker to automatically have access to the same certificate used for signing high-security service messages.
Furthermore, attempts are made at the sysadmin-level to ensure that fedmsg-enabled services run as users that have exclusive read access to their own keys. See the Fedora Infrastructure SOP for more information (including how to generate new certs/bring up new services).
Routing Policy¶
Messages are also checked to see if the name of the certificate they bear and the topic they’re routed on match up in a routing_policy dict. Is the build server allowed to send messages about wiki updates? Not if the routing policy has anything to say about it.
Note
By analogy, “signature validation is to authentication as routing policy checks are to authorization.”
If the topic of a message appears in the routing_policy, the name borne on the certificate must also appear under the associated list of permitted publishers or the message is marked invalid.
If the topic of a message does not appear in the routing_policy, two different courses of action are possible:
- If routing_nitpicky is set to False, then the message is given the green light. Our routing policy doesn’t have anything specific to say about messages of this topic and so who are we to deny it passage, right?
- If routing_nitpicky is set to True, then we deny the message and mark it as invalid.
Typically, you’ll deploy fedmsg with nitpicky mode turned off. You can build your policy over time as you determine what services will be sending what messages. Once deployment of fedmsg reaches a certain level of stability, you can turn nitpicky mode on for enhanced security, but by doing so you may break certain message paths that you’ve forgotten to include in your routing policy.
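The routing decision described in this section can be sketched as a small pure function. This is an illustration of the rules as stated here, not the library's implementation; is_authorized, the topic and the certificate names are made up for the example.

```python
def is_authorized(topic, signer, routing_policy, routing_nitpicky=False):
    """Return True if `signer` may publish on `topic` under the policy."""
    if topic in routing_policy:
        # The topic is listed: the signer must be a permitted publisher.
        return signer in routing_policy[topic]
    # The topic is not listed: allow it unless nitpicky mode is on.
    return not routing_nitpicky


# Hypothetical policy: only the wiki host may announce wiki edits.
policy = {"org.example.prod.wiki.article.edit": ["wiki-app01.example.org"]}

print(is_authorized("org.example.prod.wiki.article.edit",
                    "buildsys-build01.example.org", policy))
print(is_authorized("org.example.prod.git.receive",
                    "git-host01.example.org", policy, routing_nitpicky=True))
```

With nitpicky mode off, the second call would be allowed because the policy says nothing about that topic.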
Configuration¶
By convention, configuration values for fedmsg.crypto are kept in /etc/fedmsg.d/ssl.py, although technically they can be kept in any config dict in /etc/fedmsg.d (or in any of the config locations checked by fedmsg.config).
The cryptography routines expect the following values to be defined:
For general information on configuration, see fedmsg.config.
Module Contents¶
fedmsg.crypto
encapsulates standalone functions for:
- Message signing.
- Signature validation.
- Stripping crypto information for view.
See fedmsg.crypto.x509 and fedmsg.crypto.gpg for implementation details.
-
fedmsg.crypto.
init
(**config)[source]¶ Initialize the crypto backend.
The backend can be one of two plugins:
- ‘x509’ - Uses x509 certificates.
- ‘gpg’ - Uses GnuPG keys.
-
fedmsg.crypto.
sign
(message, **config)[source]¶ Insert two new fields into the message dict and return it.
Those fields are:
- ‘signature’ - the computed message digest of the JSON repr.
- ‘certificate’ - the base64 certificate or gpg key of the signator.
-
fedmsg.crypto.
strip_credentials
(message)[source]¶ Strip credentials from a message dict.
A new dict is returned without either signature or certificate keys. This method can be called safely; the original dict is not modified.
This function is applicable when using either the x509 or gpg backends.
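The non-mutating behaviour described for strip_credentials can be illustrated with a stand-in function. This is a sketch of the documented contract only; strip_credentials_sketch is a hypothetical name and the field values are placeholders.

```python
def strip_credentials_sketch(message):
    """Return a copy of `message` without signature/certificate keys."""
    return {
        key: value
        for key, value in message.items()
        if key not in ("signature", "certificate")
    }


signed = {
    "topic": "org.example.dev.test.testing",
    "msg": {"test": "Hello World"},
    "signature": "placeholder-signature",
    "certificate": "placeholder-certificate",
}
stripped = strip_credentials_sketch(signed)
print(sorted(stripped))
```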
-
fedmsg.crypto.
validate
(message, **config)[source]¶ Return True if the message is signed appropriately, and False otherwise.
-
fedmsg.crypto.
validate_signed_by
(message, signer, **config)[source]¶ Validate that a message was signed by a particular certificate.
This works much like validate(...), but additionally accepts a signer argument. It will reject a message for any of the regular circumstances, but will also reject it if it’s not signed by a cert with the argued name.
Message Encoding¶
fedmsg messages are encoded as JSON.
Use the functions fedmsg.encoding.loads(), fedmsg.encoding.dumps(), and fedmsg.encoding.pretty_dumps() to encode/decode.
When serializing objects (usually python dicts) with fedmsg.encoding.dumps() and fedmsg.encoding.pretty_dumps(), the following exceptions to normal JSON serialization are observed.
- datetime.datetime objects are correctly converted to seconds since the epoch.
- For objects that are not JSON serializable, if they have a .__json__() method, that will be used instead.
- SQLAlchemy models that do not specify a .__json__() method will be run through fedmsg.encoding.sqla.to_json() which recursively produces a dict of all attributes and relations of the object(!) Be careful using this, as you might expose information to the bus that you do not want to. See Cryptography and Message Signing for considerations.
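The first two exceptions can be approximated with a standard-library json.JSONEncoder subclass. This is a sketch, not the encoder fedmsg ships: SketchEncoder and Tag are hypothetical names, and treating naive datetimes as UTC is an assumption made for the example.

```python
import calendar
import datetime
import json


class SketchEncoder(json.JSONEncoder):
    """Illustrative encoder handling datetimes and __json__ methods."""

    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            # Seconds since the epoch; naive datetimes are assumed UTC.
            return calendar.timegm(obj.utctimetuple())
        if hasattr(obj, "__json__"):
            return obj.__json__()
        return super().default(obj)


class Tag(object):
    """Hypothetical model exposing a __json__ method."""

    def __init__(self, name):
        self.name = name

    def __json__(self):
        return {"tag": self.name}


payload = {"when": datetime.datetime(1970, 1, 1, 0, 0, 10), "tag": Tag("awesome")}
encoded = json.dumps(payload, cls=SketchEncoder, sort_keys=True)
print(encoded)
```

Objects that are neither datetimes nor carry a __json__ method still raise TypeError, as they would with the plain encoder.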
-
fedmsg.encoding.
loads
(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)[source]¶ Deserialize s (a str or unicode instance containing a JSON document) to a Python object.
If s is a str instance and is encoded with an ASCII based encoding other than utf-8 (e.g. latin-1) then an appropriate encoding name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed and should be decoded to unicode first.
object_hook is an optional function that will be called with the result of any object literal decode (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting).
object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.
parse_float, if specified, will be called with the string of every JSON float to be decoded. By default this is equivalent to float(num_str). This can be used to use another datatype or parser for JSON floats (e.g. decimal.Decimal).
parse_int, if specified, will be called with the string of every JSON int to be decoded. By default this is equivalent to int(num_str). This can be used to use another datatype or parser for JSON integers (e.g. float).
parse_constant, if specified, will be called with one of the following strings: -Infinity, Infinity, NaN, null, true, false. This can be used to raise an exception if invalid JSON numbers are encountered.
To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used.
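The hooks above are the same ones accepted by the standard library's json.loads, so their effect can be tried without fedmsg installed. The sketch below uses json.loads directly and assumes fedmsg.encoding.loads accepts the same keyword arguments, as the signature shown suggests; reject_constant is a hypothetical callback.

```python
import json
from collections import OrderedDict
from decimal import Decimal

document = '{"b": 1.10, "a": 2}'

# Preserve key order and decode floats as exact decimals.
decoded = json.loads(document, object_pairs_hook=OrderedDict,
                     parse_float=Decimal)
print(list(decoded.keys()), decoded["b"])


def reject_constant(name):
    """Refuse non-finite numbers such as NaN or Infinity."""
    raise ValueError("invalid JSON constant: %s" % name)


try:
    json.loads("[NaN]", parse_constant=reject_constant)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```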
-
fedmsg.encoding.
dumps
(o)¶ Return a JSON string representation of a Python data structure.
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
-
fedmsg.encoding.
pretty_dumps
(o)¶ Return a JSON string representation of a Python data structure.
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
SQLAlchemy Encoding Utilities¶
fedmsg.encoding.sqla houses utility functions for JSONifying sqlalchemy models that do not define their own .__json__() methods.
Use at your own risk. fedmsg.encoding.sqla.to_json() will expose all attributes and relations of your sqlalchemy object and may expose information you do not want it to. See Cryptography and Message Signing for considerations.
“Natural Language” Representation of Messages¶
fedmsg.meta handles the conversion of fedmsg messages (dict-like json objects) into internationalized human-readable strings: strings like "nirik voted on a tag in tagger" and "lmacken commented on a bodhi update."
The intent is to use the module 1) in the fedmsg-irc bot and 2) in the gnome-shell desktop notification widget. The sky is the limit, though.
The primary entry point is fedmsg.meta.msg2repr() which takes a dict and returns the string representation. Portions of that string are in turn produced by fedmsg.meta.msg2title(), fedmsg.meta.msg2subtitle(), and fedmsg.meta.msg2link().
Message processing is handled by a list of MessageProcessors (instances of fedmsg.meta.base.BaseProcessor) which are discovered on a setuptools entry-point. Messages for which no MessageProcessor exists are handled gracefully.
The original deployment of fedmsg in Fedora Infrastructure uses metadata providers/message processors from a plugin called fedmsg_meta_fedora_infrastructure.
If you’d like to add your own processors for your own deployment, you’ll need to extend fedmsg.meta.base.BaseProcessor and override the appropriate methods. If you package up your processor and expose it on the fedmsg.meta entry-point, your new class will need to be added to the fedmsg.meta.processors list at runtime.
End users can have multiple plugin sets installed simultaneously.
-
fedmsg.meta.
conglomerate
(messages, subject=None, lexers=False, **config)[source]¶ Return a list of messages with some of them grouped into conglomerate messages. Conglomerate messages represent several other messages.
For example, you might pass this function a list of 40 messages. 38 of those are git.commit messages, 1 is a bodhi.update message, and 1 is a badge.award message. This function could return a list of three messages, one representing the 38 git commit messages, one representing the bodhi.update message, and one representing the badge.award message.
The subject argument is optional; when it is given, “subjective” representations are returned if possible (see msg2subjective(…)).
Functionality is provided by fedmsg.meta plugins on a “best effort” basis.
-
fedmsg.meta.
graceful
(cls)[source]¶ A decorator to protect against message structure changes.
Many of our processors expect messages to be in a certain format. If the format changes, they may start to fail and raise exceptions. This decorator is in place to catch and log those exceptions and to gracefully return default values.
-
fedmsg.meta.
legacy_condition
(cls)¶ A decorator to protect against message structure changes.
Many of our processors expect messages to be in a certain format. If the format changes, they may start to fail and raise exceptions. This decorator is in place to catch and log those exceptions and to gracefully return default values.
-
fedmsg.meta.
make_processors
(**config)[source]¶ Initialize all of the text processors.
You’ll need to call this once before using any of the other functions in this module.
>>> import fedmsg.config
>>> import fedmsg.meta
>>> config = fedmsg.config.load_config([], None)
>>> fedmsg.meta.make_processors(**config)
>>> text = fedmsg.meta.msg2repr(some_message_dict, **config)
-
fedmsg.meta.
msg2agent
(msg, processor=None, **config)[source]¶ Return the single username who is the “agent” for an event.
An “agent” is the one responsible for the event taking place, for example, if one person gives karma to another, then both usernames are returned by msg2usernames, but only the one who gave the karma is returned by msg2agent.
If the processor registered to handle the message does not provide an agent method, then the first user returned by msg2usernames is returned (whether that is correct or not). Here we assume that if a processor implements agent, then it knows what it is doing and we should trust that. But if it does not implement it, we’ll try our best guess.
If there are no users returned by msg2usernames, then None is returned.
-
fedmsg.meta.
msg2emails
(msg, **config)[source]¶ Return a dict mapping of usernames to email addresses.
-
fedmsg.meta.
msg2lexer
(msg, processor=None, **config)[source]¶ Return a Pygments lexer able to parse the long_form of this message.
-
fedmsg.meta.
msg2long_form
(msg, **config)[source]¶ Return a ‘long form’ text representation of a message.
For most messages, this will just default to the terse subtitle, but for some messages a long paragraph-structured block of text may be returned.
-
fedmsg.meta.
msg2objects
(msg, **config)[source]¶ Return a set of objects associated with a message.
“objects” here means the “objects” of English grammar: the thing in the message upon which action is being done. The “subject” is the user and the “object” is the packages, or the wiki articles, or the blog posts.
Where possible, use slash-delimited names for objects (as in wiki URLs).
-
fedmsg.meta.
msg2packages
(msg, **config)[source]¶ Return a set of package names associated with a message.
-
fedmsg.meta.
msg2processor
(msg, **config)[source]¶ For a given message return the text processor that can handle it.
This will raise a fedmsg.meta.ProcessorsNotInitialized exception if fedmsg.meta.make_processors() hasn’t been called yet.
-
fedmsg.meta.
msg2repr
(msg, **config)[source]¶ Return a human-readable or “natural language” representation of a dict-like fedmsg message. Think of this as the ‘top-most level’ function in this module.
-
fedmsg.meta.
msg2secondary_icon
(msg, **config)[source]¶ Return a secondary icon associated with a message.
-
fedmsg.meta.
msg2subjective
(msg, **config)[source]¶ Return a human-readable text representation of a dict-like fedmsg message from the subjective perspective of a user.
For example, if the subject viewing the message is “oddshocks” and the message would normally translate into “oddshocks commented on ticket #174”, it would instead translate into “you commented on ticket #174”.
-
fedmsg.meta.
msg2subtitle
(msg, **config)[source]¶ Return a ‘subtitle’ or secondary text associated with a message.
-
fedmsg.meta.
msg2title
(msg, **config)[source]¶ Return a ‘title’ or primary text associated with a message.
-
fedmsg.meta.
msg2usernames
(msg, **config)[source]¶ Return a set of FAS usernames associated with a message.
-
fedmsg.meta.
processors
= ProcessorsNotInitialized('You must first call fedmsg.meta.make_processors(**config)',)¶
-
class
fedmsg.meta.base.
BaseConglomerator
(processor, internationalization_callable, **conf)[source]¶ Bases:
object
Base Conglomerator. This abstract base class must be extended.
fedmsg.meta “conglomerators” are similar to but different from the fedmsg.meta “processors”. Where processors take a single message and return metadata about it (subtitle, a list of usernames, etc.), conglomerators take multiple messages and return a reduced subset of “conglomerate” messages. Think: there are 100 messages where pbrobinson built 100 different packages in koji – we can just represent those in a UI somewhere as a single message “pbrobinson built 100 different packages (click for details)”.
This BaseConglomerator is meant to be extended many times over to provide plugins that know how to conglomerate different combinations of messages.
-
conglomerate
(messages, subject=None, lexers=False, **conf)[source]¶ Top-level API entry point. Given a list of messages, transform it into a list of conglomerates where possible.
-
static
list_to_series
(items, N=3, oxford_comma=True)[source]¶ Convert a list of things into a comma-separated string.
>>> list_to_series(['a', 'b', 'c', 'd'])
'a, b, and 2 others'
>>> list_to_series(['a', 'b', 'c', 'd'], N=4, oxford_comma=False)
'a, b, c and d'
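For readers who want to see how such a series string can be produced, here is a standalone re-implementation that reproduces the two documented outputs above. It is a sketch written for this page, not the method's actual source, and list_to_series_sketch is a hypothetical name; its behaviour for empty lists in particular is an assumption.

```python
def list_to_series_sketch(items, N=3, oxford_comma=True):
    """Join items into a series, collapsing the tail beyond N entries."""
    items = [str(item) for item in items]
    if not items:
        return ""  # Assumption: the real method may return something else.
    if len(items) == 1:
        return items[0]
    if len(items) > N:
        # Replace everything from position N-1 onwards with a count.
        items[N - 1:] = ["%d others" % (len(items) - N + 1)]
    conjunction = " and "
    if oxford_comma and len(items) > 2:
        conjunction = "," + conjunction
    return ", ".join(items[:-1]) + conjunction + items[-1]


print(list_to_series_sketch(["a", "b", "c", "d"]))
print(list_to_series_sketch(["a", "b", "c", "d"], N=4, oxford_comma=False))
```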
-
merge
(constituents, subject, **config)[source]¶ Given N presumably matching messages, return one merged message.
-
classmethod
produce_template
(constituents, subject, lexers=False, **config)[source]¶ Helper function used by merge. Produces the beginnings of a merged conglomerate message that needs to be later filled out by a subclass.
-
class
fedmsg.meta.base.
BaseProcessor
(internationalization_callable, **config)[source]¶ Bases:
object
Base Processor. Without being extended, this doesn’t actually handle any messages.
Processors require that an internationalization_callable be passed to them at instantiation. Internationalization is often done at import time, but we handle it at runtime so that a single process may translate fedmsg messages into multiple languages. Think: an IRC bot that runs #fedora-fedmsg, #fedora-fedmsg-es, #fedora-fedmsg-it. Or: a twitter bot that posts to multiple language-specific accounts.
That feature is currently unused, but fedmsg.meta supports future internationalization (there may be bugs to work out).
-
agent
= NotImplemented¶
-
conglomerate
(messages, **config)[source]¶ Given N messages, return another list that has some of them grouped together into a common ‘item’.
A conglomeration of messages should be of the following form:
{
    'subtitle': 'relrod pushed commits to ghc and 487 other packages',
    'link': None,  # This could be something.
    'icon': 'https://that-git-logo',
    'secondary_icon': 'https://that-relrod-avatar',
    'start_time': some_timestamp,
    'end_time': some_other_timestamp,
    'human_time': '5 minutes ago',
    'usernames': ['relrod'],
    'packages': ['ghc', 'nethack', ... ],
    'topics': ['org.fedoraproject.prod.git.receive'],
    'categories': ['git'],
    'msg_ids': {
        '2014-abcde': {
            'subtitle': 'relrod pushed some commits to ghc',
            'title': 'git.receive',
            'link': 'http://...',
            'icon': 'http://...',
        },
        '2014-bcdef': {
            'subtitle': 'relrod pushed some commits to nethack',
            'title': 'git.receive',
            'link': 'http://...',
            'icon': 'http://...',
        },
    },
}
The telltale sign that an entry in a list of messages represents a conglomerate message is the presence of the plural msg_ids field. In contrast, ungrouped singular messages should bear a singular msg_id field.
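A consumer of such a mixed list could separate the two kinds of entries by checking for the plural field. The sketch below only relies on the msg_ids / msg_id convention stated above; is_conglomerate and split_entries are hypothetical helpers and the sample entries are abbreviated.

```python
def is_conglomerate(entry):
    """A conglomerate entry carries the plural 'msg_ids' field."""
    return "msg_ids" in entry


def split_entries(entries):
    """Partition entries into (conglomerates, singular messages)."""
    conglomerates = [e for e in entries if is_conglomerate(e)]
    singles = [e for e in entries if not is_conglomerate(e)]
    return conglomerates, singles


entries = [
    {"subtitle": "relrod pushed commits to ghc and 487 other packages",
     "msg_ids": {"2014-abcde": {}, "2014-bcdef": {}}},
    {"subtitle": "a single ungrouped message", "msg_id": "2014-cdefg"},
]
conglomerates, singles = split_entries(entries)
print(len(conglomerates), len(singles))
```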
-
conglomerators
= None¶
-
handle_msg
(msg, **config)[source]¶ If we can handle the given message, return the remainder of the topic.
Returns None if we can’t handle the message.
-
lexer
(msg, **config)[source]¶ Return a pygments lexer that can be applied to the long_form.
Returns None if no lexer is associated.
-
topic_prefix_re
= None¶
Replay¶
-
fedmsg.replay.
check_for_replay
(name, names_to_seq_id, msg, config, context=None)[source]¶ Check to see if messages need to be replayed.
Parameters:
- name (str) – The consumer’s name.
- names_to_seq_id (dict) – A dictionary that maps names to the last seen sequence ID.
- msg (dict) – The latest message that has arrived.
- config (dict) – A configuration dictionary. This dictionary should contain, at a minimum, two keys. The first key, ‘replay_endpoints’, should be a dictionary that maps name to a ZeroMQ socket. The second key, ‘io_threads’, is an integer used to initialize the ZeroMQ context.
- context (zmq.Context) – The ZeroMQ context to use. If a context is not provided, one will be created.
Returns: A list of message dictionaries.
Return type: list
-
fedmsg.replay.
get_replay
(name, query, config, context=None)[source]¶ Query the replay endpoint for missed messages.
Parameters:
- name (str) – The replay endpoint name.
- query (dict) – A dictionary used to query the replay endpoint for messages. Queries are dictionaries with any of the following keys:
  - ‘seq_ids’: A list of int, matching the seq_id attributes of the messages. It should return at most as many messages as the length of the list, assuming no duplicates.
  - ‘seq_id’: A single int matching the seq_id attribute of the message. Should return a single message. It is intended as a shorthand for singleton seq_ids queries.
  - ‘seq_id_range’: A two-tuple of int defining a range of seq_id to check.
  - ‘msg_ids’: A list of UUIDs matching the msg_id attribute of the messages.
  - ‘msg_id’: A single UUID for the msg_id attribute.
  - ‘time’: A tuple of two timestamps. It will return all messages emitted in between.
- config (dict) – A configuration dictionary. This dictionary should contain, at a minimum, two keys. The first key, ‘replay_endpoints’, should be a dictionary that maps name to a ZeroMQ socket. The second key, ‘io_threads’, is an integer used to initialize the ZeroMQ context.
- context (zmq.Context) – The ZeroMQ context to use. If a context is not provided, one will be created.
Returns: A generator that yields message dictionaries.
Return type: generator
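Query dictionaries of the shape described above can be checked before they are sent. The following sketch validates the keys and expands the seq_id shorthand; it does not contact a replay endpoint, and check_query and expand_shorthand are hypothetical helpers rather than fedmsg functions.

```python
ALLOWED_QUERY_KEYS = frozenset(
    ["seq_ids", "seq_id", "seq_id_range", "msg_ids", "msg_id", "time"]
)


def check_query(query):
    """Raise ValueError if the query uses a key not listed above."""
    unknown = set(query) - ALLOWED_QUERY_KEYS
    if unknown:
        raise ValueError("unknown query keys: %s" % ", ".join(sorted(unknown)))
    return query


def expand_shorthand(query):
    """Rewrite a 'seq_id' query as the equivalent singleton 'seq_ids'."""
    expanded = dict(query)
    if "seq_id" in expanded:
        expanded["seq_ids"] = [expanded.pop("seq_id")]
    return expanded


query = expand_shorthand(check_query({"seq_id": 42}))
print(query)
```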
The fedmsg Protocol¶
fedmsg uses ZeroMQ Publish-Subscribe (PUBSUB) sockets for the messages sent by fedmsg.publish() and the messages received by fedmsg.tail_messages() or by way of the Moksha Hub-Consumer approach.
Warning
The message format described below is not part of the public API at this time.
The published ZeroMQ message consists of a multi-part message of exactly two frames, formatted on the wire as follows:
- Frame 0: The message topic against which subscribers will perform a binary comparison.
- Frame 1: The JSON-serialized, UTF-8 encoded message.
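The two-frame layout can be illustrated without a ZeroMQ socket by building and parsing the frames as byte strings. This is a sketch of the layout described above and, like the format itself, not a public API; encode_frames and decode_frames are hypothetical names, and encoding the topic as UTF-8 is an assumption (the text only specifies the encoding of Frame 1).

```python
import json


def encode_frames(topic, message):
    """Build the two frames: topic bytes, then UTF-8 JSON payload."""
    return [topic.encode("utf-8"), json.dumps(message).encode("utf-8")]


def decode_frames(frames):
    """Parse a two-frame message back into (topic, message dict)."""
    if len(frames) != 2:
        raise ValueError("expected exactly two frames, got %d" % len(frames))
    topic, payload = frames
    return topic.decode("utf-8"), json.loads(payload.decode("utf-8"))


frames = encode_frames("org.example.dev.test.testing", {"test": "Hello World"})
topic, message = decode_frames(frames)
print(topic, message)
```

With pyzmq, a list of byte frames like this is what a multipart send would carry, but that step is left out so the sketch stays runnable on its own.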