Culture & Technology


Software Package Complexity - Modular Structure vs. Modular Design
Presenter: Christopher Blöcker
Abstract
The functions in a static call graph can be grouped in different ways, for example with community detection, or according to how developers have organized functions, methods, and classes into source code files. We use the map equation to quantify the complexity of these two ways of partitioning the static call graphs of software packages. With flow divergence, we measure the excess complexity of the file-based structure relative to the community structure, which can hint at a need or opportunity for refactoring the codebase.
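As an illustration of the comparison described above, the following minimal sketch computes the two-level map equation codelength of a hard partition on an undirected, weighted call graph, with visit rates approximated by node strength, so that a file-based grouping and a community-detection grouping can be compared by codelength. The graph, partitions, and function names are illustrative assumptions; this is not the presenters' flow-divergence measure itself.

# Minimal two-level map equation codelength for a hard partition of an
# undirected weighted graph (visit rates proportional to node strength).
# Illustrative only; not the presenters' flow-divergence measure.
import math
from collections import defaultdict
import networkx as nx

def plogp(p):
    return p * math.log2(p) if p > 0 else 0.0

def map_equation_codelength(G, partition):
    """G: undirected weighted nx.Graph; partition: dict node -> module id."""
    W = G.size(weight="weight")                                  # total edge weight
    p = {v: G.degree(v, weight="weight") / (2 * W) for v in G}   # node visit rates
    q = defaultdict(float)                                       # module exit rates
    for u, v, w in G.edges(data="weight", default=1.0):
        if partition[u] != partition[v]:
            q[partition[u]] += w / (2 * W)
            q[partition[v]] += w / (2 * W)
    modules = set(partition.values())
    q_total = sum(q[m] for m in modules)
    # index codebook: q_total * H(Q)
    index = -sum(plogp(q[m] / q_total) for m in modules if q_total > 0) * q_total
    # module codebooks
    modulelen = 0.0
    for m in modules:
        members = [v for v in G if partition[v] == m]
        total = q[m] + sum(p[v] for v in members)
        H = -(plogp(q[m] / total) + sum(plogp(p[v] / total) for v in members))
        modulelen += total * H
    return index + modulelen

# Compare a file-based grouping with a community-detection grouping:
# files = {func: filename, ...}; communities = {func: module, ...}
# excess = map_equation_codelength(G, files) - map_equation_codelength(G, communities)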
A day on Twitter is a fractal in socio-cultural space-time
Presenter: Haiko Lietz
Abstract
The day-on-Twitter dataset covers a complete day of communication. The weighted social network of reciprocated mentions contains 7.2M nodes and 9.2M edges. It is a small-world network with a characteristic path length comparable to that of a corresponding random network. Another paradigmatic network topology is that of self-similarity. A network is self-similar if the minimum number of boxes needed to cover the network scales as a power law with the size of the box. The self-similar property is theoretically at odds with the small-world property because self-similar networks have large characteristic path lengths. The complete social network of reciprocated mentions is not self-similar. However, we know that small-world networks can have an underlying self-similar structure when weak ties are removed. We develop this line of research by applying percolation theory. Using minimum tie strength s_min as a control parameter, we identify a topological phase transition at s_min=4. This transition is marked by a sharp decrease of the percolation probability (fraction of nodes in the largest component) and a maximization of the susceptibility (average size of all components but the largest). Only at the critical value is the distribution of component sizes a power law and the network self-similar. To shed light on the mentioning dynamics, we define a connectivity avalanche as an event in which a user continually mentions, or is continually mentioned by, another user. The avalanche size scales as a power law with its frequency (inverse duration). The exponent is between -1.30 and -1.55. This means that the mentioning practice resembles fractional Brownian motion with rather long-term memory. In the talk, we will discuss further results about the cultural dynamics, how they contribute to our conclusion that a day on Twitter is a fractal in socio-cultural space-time, and how we think structure and dynamics go hand in hand to form what we can then measure.
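A minimal sketch of the thresholding procedure described above, assuming the mention network is available as a weighted networkx graph; the function name, threshold range, and variable names are illustrative.

# Illustrative weight-threshold percolation scan: for each minimum tie
# strength s_min, keep edges with weight >= s_min and record the fraction
# of nodes in the largest component (percolation probability) and the mean
# size of the remaining components (susceptibility, as defined in the abstract).
import networkx as nx

def percolation_scan(G, s_values):
    results = []
    for s_min in s_values:
        H = nx.Graph((u, v) for u, v, w in G.edges(data="weight", default=1) if w >= s_min)
        H.add_nodes_from(G)                           # keep isolated nodes in the count
        sizes = sorted((len(c) for c in nx.connected_components(H)), reverse=True)
        largest, rest = sizes[0], sizes[1:]
        P = largest / G.number_of_nodes()             # percolation probability
        chi = sum(rest) / len(rest) if rest else 0.0  # susceptibility
        results.append((s_min, P, chi))
    return results

# e.g. scan = percolation_scan(mention_graph, range(1, 11))
# the critical s_min is where P drops sharply and chi peaks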
Learning in a Software Task Space
Presenter: Xiangnan Feng
Abstract
In the rapidly evolving global economy, the computer science and software industry represents one of the fastest-developing sectors. Today, the software industry is a major sector, with over half of the ten most valuable companies in the world operating on the internet and in software markets. However, we know little about the exact nature of jobs in the software sector itself, including their task composition and the evolution thereof. This restricts our ability to anticipate future trends and help workers, firms and countries adapt to the changing demands of the software labor market. To meet this challenge, we focus on the computer science and software industry and extract tasks from Stack Overflow, an online question-and-answer platform for computer programming topics. Utilizing network science tools, we construct a task space for individuals involved in software development and other programming activities. This framework allows us to define tasks and facilitates an in-depth analysis of work in the software industry. Our research provides a powerful and insightful framework to study the software industry landscape and the labor market.
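The abstract does not spell out the construction, but task spaces of this kind are commonly built from co-occurrence statistics; the sketch below assumes, purely for illustration, that each Stack Overflow question contributes a set of tags standing in for tasks.

# Illustrative construction of a task co-occurrence network from tagged
# questions: each question is a set of tags (proxies for tasks), and two
# tasks are linked by the number of questions in which they co-occur.
# A generic sketch, not the presenters' exact task-extraction pipeline.
from itertools import combinations
import networkx as nx

def build_task_space(questions):
    """questions: iterable of tag sets, e.g. [{"python", "pandas"}, ...]"""
    G = nx.Graph()
    for tags in questions:
        for a, b in combinations(sorted(tags), 2):
            w = G[a][b]["weight"] + 1 if G.has_edge(a, b) else 1
            G.add_edge(a, b, weight=w)
    return G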
Technological portfolio strategies: coherence, coreness, and diversification patterns
Presenter: Tomomi Kito
Abstract
Globalization and rapid technological advancements have heightened competition among firms. Both startups and established companies face obsolescence if they cling to specific technologies or business areas. To survive and drive innovation, many firms adopt technological portfolio diversification strategies to expand their technological domains. A prominent approach is related diversification, where firms extend their portfolios into closely related technological areas. The relationship between related diversification and firm performance has been a central topic in management research. Recent studies utilizing the concept of “relatedness” have quantified proximities between technologies and constructed technology relatedness networks. Prior research [1] indicates that firms with more consistent portfolios—comprising closely related technologies—tend to perform better, though this trend is predominantly driven by a small number of large firms. This study introduces a novel framework for quantitatively evaluating technological portfolios using three metrics: degree of related diversification (for which we propose a new measure), portfolio coherence [1], and technology coreness (a modified measure [2]). Using patent data, we analyzed the characteristics and temporal evolution of portfolios across firms of varying sizes and industries. Our findings reveal that most firms’ core technological domains do not align with the central nodes of cohesive diversification (see Fig. 1). Smaller firms often pursue niche strategies, combining unrelated or sparsely populated domains. Temporal analyses uncovered distinct diversification patterns influenced by portfolio size and the uniqueness of technological domains. This framework offers actionable insights for firms seeking to refine their portfolio strategies, helping them effectively expand or concentrate their technological focus based on core strengths, resource constraints, competitive dynamics, and market conditions.
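For orientation, the sketch below follows the common co-occurrence ("proximity") recipe for a technology relatedness matrix and scores a portfolio's coherence as the average pairwise relatedness of its technology classes. It is a generic illustration under those assumptions, not the new related-diversification measure or the modified coreness measure proposed in the talk.

# Generic sketch: relatedness between technology classes as co-occurrence
# across firm portfolios, normalized by the more ubiquitous class, and
# portfolio coherence as the mean pairwise relatedness within one portfolio.
from itertools import combinations
from collections import Counter

def relatedness(portfolios):
    """portfolios: dict firm -> set of technology classes."""
    count = Counter()   # how many portfolios contain each class
    pair = Counter()    # how many portfolios contain each pair of classes
    for techs in portfolios.values():
        count.update(techs)
        pair.update(frozenset(p) for p in combinations(sorted(techs), 2))
    return {p: n / max(count[t] for t in p) for p, n in pair.items()}

def coherence(techs, phi):
    """Average pairwise relatedness of one firm's technology classes."""
    pairs = [frozenset(p) for p in combinations(sorted(techs), 2)]
    return sum(phi.get(p, 0.0) for p in pairs) / len(pairs) if pairs else 0.0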
Digitising History: Network Perspectives of a Medieval Court
Presenter: Marcella Tambuscio
Abstract
Networks offer a powerful lens for historians, allowing them to analyse the complex interplay of individuals, groups, and institutions over time. Our research, part of the "Managing Maximilian" project, uses the network paradigm to explore the multifaceted interactions around Maximilian I (1459–1519), the Habsburg emperor renowned for his strategic and propagandistic skills. In our subproject, Digitising Maximilian, we are building a digital platform that integrates and analyses data from different contexts. The data model, based on the “factoid approach”, ensures interoperability while accommodating diverse datasets, providing a large knowledge graph in which several types of Entities are connected through Statements. While building the platform, we ran some preliminary temporal network analysis on short texts from the dataset Regesta Imperii XIII. Standard network methods revealed diachronic shifts in relationships, demonstrating the potential of network techniques for prosopographical studies. Building on these insights, we now incorporate more granular datasets from the ManMax project, introducing multiplex networks that capture distinct types of interactions, such as correspondences, payments, or events. This approach mitigates the overrepresentation issues of co-occurrence models by differentiating layers and their interrelations. Early results are promising: the network, though still small, shows emergent structures like a giant component and recurrent patterns of activities, highlighting the intricate mechanisms of influence and collaboration at the emperor’s court. As the database grows, we anticipate identifying large connected components across multiple layers, and future plans include inferring hidden elements in the network—such as unrecorded female actors or emergent communities—using, for instance, graph embeddings. These tools will help uncover latent connections, offering unprecedented insights into the social fabric of Maximilian's reign.
The Economic Complexity of the Roman Empire
Presenter: Michele Coscia
Abstract
Economic complexity is a key concept allowing us to forecast the growth of a country. Given its relevance for economic development, much attention has been devoted to studying how countries acquire economic complexity. In this work, we take a historical perspective, hypothesizing that complexity from many centuries in the past leaves detectable traces today. We leverage a dataset of 500k Latin inscriptions created during the span of the Roman Empire. From these inscriptions, we obtain geolocated data about the occupations in the Roman Empire. We can create a bipartite province-occupation network, after dealing with biases of archaeological data -- gender disparities, occupation status, diversity in research intensity. The data shows the nestedness pattern typical of modern economies -- with complex places including all occupations and basic occupations expressed in every place. We create an Occupation Space, linking occupations if they co-appear significantly in the same places (Fig 1a). We can estimate the economic complexity of the Roman Empire (Fig 1b). Since the empire spanned an area including 36 modern countries, we can correlate present and past complexity. When we do so, we find that ancient complexity is a marginally significant predictor of modern complexity, hinting that the structures of complexity -- whether they are due to geographical location or other exogenous factors -- survive even after millennia of history.
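For readers unfamiliar with the machinery, the sketch below computes a standard Economic Complexity Index from a province-by-occupation count matrix: binarize with revealed comparative advantage, then take the eigenvector formulation of the method of reflections. It is a generic illustration under those assumptions and omits the archaeological bias corrections described above; the matrix and function names are illustrative.

# Generic ECI sketch: RCA-binarize a province x occupation count matrix and
# take the eigenvector associated with the second-largest eigenvalue of the
# province-to-province reflections matrix. Sign is conventionally fixed by
# correlating the result with diversity.
import numpy as np

def eci(X):
    """X: provinces x occupations count matrix (2D numpy array)."""
    rca = (X / X.sum(1, keepdims=True)) / (X.sum(0, keepdims=True) / X.sum())
    M = (rca >= 1).astype(float)
    diversity, ubiquity = M.sum(1), M.sum(0)
    # province-to-province transition matrix of the reflections iteration
    Mpp = (M / diversity[:, None]) @ (M / ubiquity).T
    vals, vecs = np.linalg.eig(Mpp)
    v = np.real(vecs[:, np.argsort(-np.real(vals))[1]])  # second eigenvector
    return (v - v.mean()) / v.std()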
Harnessing Higher-Order Networks to Trace the Development of Doctrine
Presenter: Dirk Hartung
Abstract
Modeling collections of historical documents as temporal citation networks allows us to investigate a wide range of dynamic phenomena, such as the diffusion of ideas or the formation of concepts. However, if the documents in question cover multiple topics and exhibit rich internal structure, the traditional modeling of documents as nodes and citations as directed edges discards too much relevant context to obtain nuanced insights. Motivated by the desire to understand how courts in different jurisdictions use existing case law to construct and support the legal reasoning communicated in their decisions, we propose to model a collection of documents as a directed temporal hypergraph, where each document is represented both as a node and as a set of directed hyperedges. Here, each hyperedge captures which documents are cited by a given document in the same context — e.g., in the same paragraph or to support the same argument — and all nodes and hyperedges are associated with timestamps. This allows us to jointly leverage two order structures in our analyses: the partial order of hyperedges by inclusion and the total order of nodes and hyperedges by their timestamps. To this end, we introduce intuitive methods based on doubly temporal posets – i.e., partially ordered sets with temporal information on the sets as well as on the elements of the ground set – with active posets recording how a document initially connects to an existing corpus, and passive posets recording how the document is subsequently integrated into the corpus. Our definitions allow us, inter alia, to compute refined versions of bibliographic coupling and co-citation that can be generalized to sets of nodes and sets of hyperedges, and to assess the temporal similarity of nodes or hyperedges based on the temporal structure of their active and passive posets. Our models and methods are readily applicable to all citation networks for which information on citation contexts is both relevant and available.
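A minimal data-structure sketch of this kind of representation, under illustrative assumptions: each document is a timestamped node, and each citation context is a directed, timestamped hyperedge from the citing document to the set of documents it cites there. The context-level coupling count hints at the refined bibliographic coupling mentioned above; the poset machinery itself is not reproduced, and all names are hypothetical.

# Minimal directed temporal hypergraph of citation contexts: documents are
# timestamped nodes; each citation context is a hyperedge (source document,
# set of cited documents, timestamp). coupling() counts shared cited
# documents between two contexts, a context-level bibliographic coupling.
from dataclasses import dataclass, field

@dataclass
class DocHypergraph:
    doc_time: dict = field(default_factory=dict)    # document -> timestamp
    hyperedges: list = field(default_factory=list)  # (source, frozenset(targets), time)

    def add_document(self, doc, time):
        self.doc_time[doc] = time

    def add_context(self, source, cited, time=None):
        t = self.doc_time[source] if time is None else time
        self.hyperedges.append((source, frozenset(cited), t))

    def coupling(self, e1, e2):
        """Shared cited documents between hyperedges e1 and e2 (by index)."""
        return len(self.hyperedges[e1][1] & self.hyperedges[e2][1])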
A Sheaf Theoretic Approach to Near Space Network Routing
Presenter: Tobias Timofeyev
Abstract
Delay Tolerant Networking (DTN), as a protocol suite for networked communications in space, is maturing into a viable enabling technology for the Solar System Internet (SSI). The goal for the SSI is to bring the reliability and efficiency gains of networking in terrestrial communications to all space communications. The focus of the SSI is now shifting towards the structural considerations in the use of DTN as it scales to larger networks. The implications of extending these architectures to the scale of the solar system, and the best way to model them, are not well understood. This work attempts to address this gap in the mathematical foundations of this network and routing problem. We explore hypergraphs as faithful representations of network structure through sheaf-theoretic routing models. In prior work, the mathematical data structures known as sheaves proved useful for modeling general routing on graphs. Sheaves have attracted much interest in applications for their ability to formalize global agreement in topological spaces, given local information and verification. This is vital in our setting because current communication methods require global knowledge of the time-varying network, which is an infeasible constraint as the networks grow in size. The directed path sheaf, when applied to a directed graph, provides a description of locally verified paths, or routing sequences, between chosen vertices. Sheaves elucidate how local information must fit together to yield globally consistent information within the structural confines of the space being modeled. Along these lines, we present what can be learned in generalizing sheaf-theoretic routing models to hypergraphs and, further, the implications of generalizing routing from graphs to hypergraphs. In particular, the structures revealed in the global sections are more general than paths, introducing new questions about what routing should look like on these structures.
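As a toy illustration of the local-to-global theme only: each vertex stores a local next-hop choice, and a route exists exactly when these local choices glue into a walk reaching the destination. This is an analogy for how global sections of a path sheaf behave, not the directed path sheaf construction itself; all names are hypothetical.

# Toy local-to-global consistency check for routing: every vertex stores only
# a local next-hop choice (its "local section"); the choices form a valid
# route exactly when, followed from the source, they glue into a walk that
# reaches the destination without revisiting a vertex.
def glues_to_route(next_hop, edges, source, target):
    """next_hop: dict vertex -> chosen successor; edges: set of (u, v) pairs."""
    visited, v = set(), source
    while v != target:
        if v in visited or v not in next_hop:
            return False              # local data fails to glue globally
        u = next_hop[v]
        if (v, u) not in edges:
            return False              # local choice is not an actual edge
        visited.add(v)
        v = u
    return True

# e.g. glues_to_route({"a": "b", "b": "c"}, {("a", "b"), ("b", "c")}, "a", "c") -> True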