Dealing with AI

The UK Government publication of its AI Opportunities Action Plan (Clifford & Department for Science Innovation and Technology, 2025) sets out an agenda primarily focussed on the growth opportunities enabled by AI.

“Many of the recommendations require translation from the problematic context to implementations using specific AI technologies, such as generative pre-trained transformers (GPT), and are therefore all examples of where problematisation is inevitable. As Callon (1980) observes:

“Problematisation culminates in configurations characterised by their relative singularity. There is not one single way of defining problems, identifying and organising what is certain, repressing what cannot be analysed.”

How then are these decisions to be enacted, the processes of deciding, in the operationalisation of this action plan? This question is difficult to answer as the action plan is just a set of recommendations, a “roadmap for government to capture the opportunities of AI to enhance growth and productivity and create tangible benefits for UK citizens.” 

Roadmaps, like actions and solutions, present static nominalisations of what should be a dynamic process. The actual intervening in problematic areas such as health care, education and provision of services etc. will emerge from the normal processes of deliberating over any technology, not just AI. Restating Callon’s perspective, there will be an “abundance of problematisations” resulting from these deliberating processes leading to options where specific choices will need to be made. It is in the choosing between options, in the different ways in which interventions with AI technologies can be problematised, that the inevitable conflicts between different stakeholder needs, differing policy objectives, will be made clear. It is here that problem structuring as a deliberative process needs to operate. 

The usefulness of Callon’s work on problematisation is that it draws attention to the choices faced by an operational researcher when investigating a problem, and that there is not a single right answer to what is problematised. Whilst this supports the claims of Churchman, Ackoff and Checkland that these choices exist and that operational researchers are making conscious decisions about what to work on (and therefore what to ignore), it does not provide an answer to the problem of OR becoming practitioner-free, i.e. the problem of dealing with situational logics. The answer is found in the emergence of problem structuring methods (PSMs).” (Yearworth, 2025, pp. 13-14)

I was writing for Operational Researchers in my book, but the point is valid for any analyst or decision maker and their profession. Choices exist, and conscious decisions need to be made about what to work on and what to ignore; these choices need to be made visible[1] and debatable through formal processes – such as through the use of PSMs.

The situational logic that sits at the heart of the AI opportunities action plan is that choosing AI leads to (economic) growth. However, if we abrogate our moral responsibility to make ethical choices and fall back on the simple rule-following of situational logics, then we may as well hand the deliberation and implementation of an AI action plan to an AI itself[2] and wash our hands of the consequences.

I recognise that the AI opportunities action plan makes specific reference to “[g]lobal leadership on AI safety and governance via the AI Safety Institute, and a proportionate, flexible regulatory approach” and reflects the fact that choices need to be made by overloading the use of “proportionate.” Deliberating and deciding over a flexible regulatory approach will require hard work; these will be (and should be) difficult choices. Given the scale of the challenges and opportunities of AI, apportioning sole agency for this deliberating and deciding to the AI Safety Institute (AISI) just narrows the location and scope of debate around problematisations to, in effect, informing decisions about the boundaries of regulation that are broadly pro-innovation. Deflecting focus away from this concentration of decision making by talking about assurance tools in an AI assurance ecosystem just sounds like marketing; i.e., our attention on the situational logic in operation here is being misdirected by the AISI.

For almost all the recommendations in the action plan, problematising should be a diffuse activity across a very broad range of actors, problem contexts, stakeholders, and technologies – putting choice into the hands of people best able to decide for themselves the scope of adoption of AI. By all means give organisations, and individuals, the processes that would enable them to make informed decisions, but these are not the imposed ‘flexible regulation’ and ‘assurance tools’ that ultimately disempower.

  • Callon, M. (1980). Struggles and Negotiations to Define What is Problematic and What is Not. In K. D. Knorr, R. Krohn, & R. Whitley (Eds.), The Social Process of Scientific Investigation (pp. 197-219). Springer Netherlands: Dordrecht. https://doi.org/10.1007/978-94-009-9109-5_8

  1. In effect, the models/maps of the structured problem.
  2. Checkland introduced the idea of the trap of situational logics in OR practice. However, it was Rosenhead, writing in Rational Analysis for a Problematic World, who used the analogy of the sausage machine, which is more than apt here.

Augmented Qualitative Analysis (AQA) and Large Language Models (LLMs)

Conducting experiments in the automated labelling of topics generated using the Augmented Qualitative Analysis (AQA) process outlined in an earlier post has produced some observations that bear on the use of Large Language Models (LLMs) in Soft OR/PSM practice.

The starting point for AQA was the partial automation of some of the elements of qualitative analysis, which has resulted in the use of probabilistic topic models to code text in a corpus at the paragraph level and to produce maps of the interrelationship of concepts (qua topics). These maps of interrelationships can only be put forward for consideration as potentially causal links after the topics have been labelled. A map showing that topic X is linked to topic Y n times is only of statistical interest. We need meaning to be attached to the topics – ideally through a process of consideration by a group that ‘owns’ the raw data – before we can produce a putative causal loop diagram (CLD).
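As a concrete illustration of this coding step, the sketch below codes each paragraph of a corpus with its dominant topic and counts how often pairs of topics co-occur in consecutive paragraphs. It is a minimal sketch only, not the AQA implementation: the toy corpus, the choice of gensim, the number of topics, and the consecutive-paragraph link rule are all assumptions made for illustration.

```python
# Minimal sketch, not the AQA pipeline: corpus, topic count and link rule are
# illustrative assumptions.
from collections import Counter
from gensim import corpora, models

# `paragraphs` stands in for the tokenised paragraphs of a real corpus.
paragraphs = [
    ["transport", "venue", "capacity", "planning"],
    ["volunteer", "recruitment", "training", "planning"],
    ["security", "venue", "screening", "capacity"],
]

dictionary = corpora.Dictionary(paragraphs)
bow = [dictionary.doc2bow(p) for p in paragraphs]
lda = models.LdaModel(bow, num_topics=3, id2word=dictionary, passes=10, random_state=1)

def dominant_topic(doc_bow):
    """Code a paragraph with its single most probable topic."""
    return max(lda.get_document_topics(doc_bow), key=lambda t: t[1])[0]

coding = [dominant_topic(d) for d in bow]

# One possible (assumed) link rule: topic X 'links to' topic Y when a paragraph
# coded X is immediately followed by a paragraph coded Y.
links = Counter(zip(coding, coding[1:]))
for (x, y), n in links.most_common():
    print(f"topic {x} -> topic {y}: {n} time(s)")
```

The resulting counts are exactly the “topic X linked to topic Y n times” map referred to above; it only becomes a candidate CLD once meaning has been attached to the topics.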

To ground this technique in traditions of qualitative analysis would require the labelling of topics to proceed through an inductive process of inspecting the term lists and inspecting the text coded by the topics to build up an understanding of what each topic means to the stakeholders (e.g., see Croidieu and Kim (2017)). This is a back-and-forth process that continues until all the topics have been labelled. Updating the map with these topic labels in parallel provides an additional perspective on the meaning of the topics.

With the advent of LLMs it is possible to feed the term lists and the example text coded by a topic – and even the map of interrelationships – into a tool like ChatGPT with the purpose of generating topic labels. However, experiments in doing this have produced disappointing results, despite extensive efforts in refining prompts. From the perspective of a qualitative researcher, the coding seems to be too much in the text, too in-vivo. Despite attempts to get the LLM to draw on the breadth of its training data, there seemed little evidence of the sort of theorising from the data that is a key feature of qualitative analysis (Hannigan et al., 2019).
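For reference, the kind of labelling experiment described here can be set up along the following lines. This is a hedged sketch rather than the prompts actually used: the model name, the prompt wording, and the helper function are assumptions, and the point made above is precisely that labels generated this way tended to stay too close to the surface of the text.

```python
# Sketch of an LLM topic-labelling call (assumed prompt and model choice);
# requires the openai package (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def suggest_topic_label(terms, example_paragraphs):
    """Ask an LLM to propose a short conceptual label for one topic."""
    prompt = (
        "You are assisting a qualitative researcher. A probabilistic topic model "
        "has produced a topic with these high-probability terms:\n"
        + ", ".join(terms)
        + "\n\nExample paragraphs coded by this topic:\n"
        + "\n---\n".join(example_paragraphs)
        + "\n\nPropose a short (2-4 word) conceptual label that goes beyond the "
        "surface wording of the text, plus one sentence of justification."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat-capable model would do
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```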

This is clearly a new area and other researchers have conducted experiments on precisely this point of prompt engineering, e.g., see Barua, Widmer and Hitzler (2024). However, there is still the sense that an LLM is operating as nothing more than a ‘stochastic parrot’ (Bender et al., 2021). Further, coupling the outputs from a probabilistic topic model to an LLM is unlikely to generate the sort of management insight that is discussed by Hannigan et al. (2019), although the putative causal maps are likely to make sense to the participants in a group and are statistically justified. Ultimately, the use of LLMs in a process of problem structuring is only ever going to be limited. Problematising is a human activity; it requires a felt-sense of a situation being problematic for an intent to intervene to emerge. Asking an LLM to feel something is a wayward expectation.

The recommendation here, for any group working with a large and potentially growing corpus of documents and in need of a technique that supports rapid problematisation, is to work with two Group Support Systems (GSS). The first presents an interactive means of exploring the probabilistic topic model (e.g., using pyLDAvis, the topic model for the 2012 Olympics data set discussed in a previous post can be explored here), combined with a means of investigating the text as coded by the topic model, i.e., selecting text that is coded by a specific topic and inductively generating a topic label that has meaning to the group. In effect, this replicates some of the features of Computer Aided Qualitative Data Analysis Software (CAQDAS). The fully labelled model can then be taken into a strategy-making workshop supported by the second GSS, in this case Strategyfinder.
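A minimal sketch of this first GSS element, assuming the gensim model, bag-of-words corpus, and dictionary from the earlier sketch, might look as follows; the helper for pulling out the text coded by a topic is a hypothetical stand-in for the CAQDAS-like selection step.

```python
# Sketch of the interactive explorer plus a CAQDAS-like retrieval step.
import pyLDAvis
import pyLDAvis.gensim_models

# Interactive topic-model explorer, saved as a standalone HTML page
# (this is the pyLDAvis view referred to above).
vis = pyLDAvis.gensim_models.prepare(lda, bow, dictionary)
pyLDAvis.save_html(vis, "topic_model_explorer.html")

def paragraphs_for_topic(topic_id, raw_paragraphs, coding):
    """Return the raw paragraphs whose dominant topic is `topic_id`,
    so a group can read them and inductively agree a label."""
    return [text for text, code in zip(raw_paragraphs, coding) if code == topic_id]
```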

The prospect of these two GSS merging into a single Problem Structuring Platform is the subject of my upcoming talk at OR66; see my previous post on AQA.

Barua, A., Widmer, C., & Hitzler, P. (2024). Concept Induction using LLMs: a user experiment for assessment. https://doi.org/10.48550/arXiv.2404.11875

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT 2021 – Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. https://doi.org/10.1145/3442188.3445922

Croidieu, G., & Kim, P. H. (2017). Labor of Love: Amateurs and Lay-expertise Legitimation in the Early U.S. Radio Field. Administrative Science Quarterly, 63(1), 1-42. https://doi.org/10.1177/0001839216686531

Hannigan, T. R., Haan, R. F. J., Vakili, K., Tchalian, H., Glaser, V. L., Wang, M. S., Kaplan, S., & Jennings, P. D. (2019). Topic modeling in management research: Rendering new theory from textual data. Academy of Management Annals, 13(2), 586-632. https://doi.org/10.5465/annals.2017.0099