Capgemini Software Engineering

Use Cases for AI

2024-09-06T00:00:00+01:00

Use Cases for Generative AI

For the past year, almost every week the Software Engineering world is asked about Artificial Intelligence (AI). “What should Capgemini’s AI offerings be?”, “Aren’t you worried about the role of Capgemini software engineers now that AI can write code?” Specifically in these cases, the questions are around Generative AI. To put that in context, here is a recap of some terms used to describe different implementations of AI.

Machine Learning

A nice dictionary definition of machine learning from Oxford Languages is: “the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyse and draw inferences from patterns in data.” The difference between a basic decision tree and machine learning is that with machine learning, a person didn’t sit down and define a set of if-statements for the machine to follow to make decisions (as used, for instance, in the NHS Map of Medicine from 2007). Instead, they provided a set of inputs and expected outputs and let the machine’s algorithms figure out how to get from input to output. You could describe it as a bottom-up approach to problem solving rather than a top-down approach. This evokes the differences between Newton and Leibnitz in their attempts to describe the physical world in the 17th century. Newton’s laws are built more like a decision tree, describing the results of forces on objects, whereas Leibnitz’s “monads” are more like the algorithms in machine learning where small, simple rules build up together to create the complex forces we see in action. They are both very useful tools in describing and learning about the physical world but may result in different predictions of varying accuracy. Machine learning algorithms are loosely based on the neurons in the brain. Each “artificial neuron” is a function that receives one or more inputs, applies weights to these inputs, and sums them to produce an output. Chained together into a “Neural Network” these small functions can complete amazing tasks such as image recognition, text generation and predictive analytics. Machine learning is incredibly useful and can benefit a wide range of industries from healthcare to travel to finance.

Generative AI

Dictionaries are not so hot on defining generative AI. Perhaps the best example is from Merriam Webster: “artificial intelligence that is capable of generating new content (such as images or text) in response to a submitted prompt (such as a query) by learning from a large reference database of examples”. The important difference here is the generation of new content. There are still machine-learning algorithms at play, but the ask is that they produce something new. The output of a generative AI query isn’t pulled from a list, it isn’t copied from an article. It is created by the neural network. You MAY get a text answer that is the same as an existing article, but it will have been probabilistically generated using an algorithm that, in simple terms, says “what word is most likely to come next in this phrase”, for example. Asking the same question twice will not produce the same output. With machine learning, the output you get is usually a probability – for example, a handwriting recognition system would give a probability percentage that a given input image contains certain words. As such, it isn’t ever completely wrong, because it doesn’t give a definitive answer. Generative AI will look at the various probable outputs and choose one by applying a separate decision algorithm. In this way, it is possible to get answers that are just plain wrong.

General AI

Wikipedia says General AI “is a type of AI that matches or surpasses human capabilities across a wide range of cognitive tasks.” To put it simply, General AI does not exist. We don’t currently accurately define human capabilities, so trying to decide whether a machine can exceed them is impossible!

Examining the Use Cases

As mentioned above, Machine Learning solutions are proving vastly useful in automating complex tasks that must make deductions based on huge quantities of data. A notable example is evaluating patient CT scan data in search of cancerous growths. The algorithm does not replace the oncologist, but assists to alleviate an otherwise “repetitive, tedious and burdensome” task. The outputs are not treated as gospel truth, but help the oncologist choose where to focus their time and efforts.

There has understandably been much buzz and excitement about the amazing capabilities of Generative AI this past year, with the broader uptake of ChatGPT and Dall-e, for example. Now that the dust has settled, a closer evaluation of Generative AI use cases is required.

Remember this diagram for blockchain use cases? Perhaps it’s time for something similar for Generative AI use cases?

The key thing to remember about Generative AI is that the answers it generates are not necessarily correct, and are not even intended to be correct. When architecting a solution that involves Generative AI, the pertinent questions are:

Does it matter if this generated answer is wrong?
Do I require the same answer to the same question if I ask it twice?

If the answer to either of these questions is YES, then you cannot use a standard Generative AI solution without some significant tuning which in itself removes the “generative-ness” of your solution.

There are still plenty of scenarios where the answer to the above questions is NO. With a focus on Microservice Development, we will look at the use cases that have been put forward for enhancing the software development lifecycle:

User Interface Design

When designing a human-computer interface to improve the user experience, there are best practices but there is no wrong answer. As such, it is a viable candidate for applying Generative AI - the role of an Agile Software Engineer in interface design is to iteratively put suggestions forward to the system user for their consideration. AI tools could accelerate prototypes, and assist the designer in breaking out of their habitual designs and trying something new. Already, design tools such as Figma and V0 by Vercel have started to put this into practice.

Automated Testing

A software test suite should allow all routes through the system to be automatically checked when a change is added, to ensure that no previous logic has been broken by the new addition. Defining “all routes” isn’t as straightforward as you may think. There is a great testing joke:

A tester walks into a bar and orders a beer.
Then they order null beers.
Then they order 99999 beers.
Then they order ytugihoij^%&*( beers.
Then they order a bear.
A customer walks into a bar and asks where the toilets are. The bar explodes.

Turning up the “temperature” on Generative AI and asking it to generate some test cases, and getting some random test cases generated which a tester might not have considered, could be useful. A similar principle is applied to testing technical architectures in Barry O’Reilly’s Residuality Theory, whereby he suggests imagining random events to test the robustness of a software architecture. For example, if you architect for the scenario that giant lizards might crawl out of the River Thames and set fire to London, your application will consequently be able to survive events such as power blackouts or train strikes! Generative AI could be used to produce those off-the-wall architectural scenarios that help improve your overall application stability over time.

Writing Code

There has been a lot published around the ability of Generative AI to write code. With applications such as Github Copilot and Azure Copilot being very easy to integrate into a developer’s workflow (or to use standalone as a pseudo developer), this would appear to be a central use case for Generative AI, and some believe that Generative AI will replace the software engineer’s role imminently. What we have seen with Capgemini software projects does not align with this, however – it is not possible to provide a junior developer with a Copilot license and expect them to perform like a senior developer, let alone enabling businesses to create software themselves without a software engineer present at all. There are a few big and obvious reasons for this:

It only saves time spent writing code
Developers spend surprisingly small amounts of time actually writing code There are various studies showing percentages between 9% and 67% of time is spent “coding”, depending on the definition, the seniority of the developer, the workplace environment. Either way, there is a lot of time that a developer spends on tasks that Generative AI code assistants are not designed to help with. Github proudly announced huge productivity gains of 55% when using Copilot, but saving 55% of the 9% of time you actually spend coding isn’t as impressive - plus, most of the examples I have seen of using Copilot focus on reducing boilerplate or writing simple games, and these use-cases are not particularly relevant to the typical task list of a senior software engineer.
It takes value away from the development lifecycle
An often-quoted example of Generative AI writing code is the creation of unit tests. The best practice when developing code is to employ Test-Driven Development (TDD), an architectural approach which forces the developer to consider their code interfaces before they even write their method implementations. Relying on Generative AI to write unit tests implies they will be written after the code, hence not following TDD practices. A counter-argument is where a team inherits a codebase which does not have high unit test coverage. Is it safe to allow Generative AI to create the tests? Again, let’s go back to the initial ask - Does it matter if this generated answer is wrong? Yes it does matter. The only thing worse than no tests is incorrect tests! It is possible to generate unit tests for an existing code framework using rules rather than Generative AI. In fact, for many Generative AI use cases this is true – a rules-based or machine learning result from a query is more useful than a generated result. What we find with many Generative AI use cases is that the useful component is very often the natural language “chatbot” interface and the way it can translate a natural language query into a search term, rather than the generated result.
It’s most useful for the people who don’t want it
It is worth considering who will find a Generative AI code assistant most useful. Senior developers are less likely to use it to generate much code, as they have already found ways to avoid boilerplate, and the difficult work of fine-tuning or bug-fixing a code base is less likely to bring useful suggestions from Copilot. It is still incredibly useful though and can save many hours of trawling the Internet. Below is a real-world example of a conversation held with Azure Copilot. The software engineer has identified an issue with the code, and uses Copilot to find a way to code a specific fix. Interestingly many of the most useful applications of Generative AI coding appear to be helping humans to write regular expressions!

As you can see from the prompt engineering, considerable knowledge is needed to get the code that is required to fix a problem in a code base. Further knowledge is needed to check whether the required answer is relevant. And user beware - if you lose your chat history in Copilot (as will happen with Microsoft Copilot in “Data Protection” mode) of course you cannot get the same answer a second time by asking the question again, and this can result in many frustrating lost hours trying to find what you briefly saw via an internet search instead! So much for time saving!

When interviewed, most senior software engineers will talk about their problem-solving mindset. They enjoy problem solving, and will actively look for problems to solve. As such, a tool which does the “fun bit” of problem solving for them is not going to have a great take-up. Regular expressions aside, it is junior developers who will be most likely to take up Generative AI and use it as a crutch. And here we must go back to our initial check: Does it matter if the code they generate is wrong? In our software supply chain pipelines we usually have built-in checks to ensure compilation, look at code quality and mandate test coverage, so some “wrong” code will be weeded out by the pipelines, but there is still a huge risk that a junior developer is shipping code the impact of which they do not really understand. Currently, hackers have to go to great lengths to get unknown code into production code-bases, now it seems we are embracing a tool that will do it for them. Care must be taken here.

It’s only as good as the average of the code it learns from
The data set used by Generative AI to produce code is often all open source code bases. This is a huge dataset and fairly comprehensive – many, many programmers have contributed to open source projects. Therefore, we have literally created the world’s most average programmer. This is rarely the target that a business sets for its IT divisions… And indeed, reports are now proving that a code base maintained with Generative AI degenerates in quality over time.
It’s just another programming language to learn
There have always been layers of abstraction between humans and machines, be it 10 strings on an abacus to represent decimal numbers, typewriter keys operating little letter printers, or an IDE auto-completing your code bracket pairs. Even Java code compiling to JVM bytecode which is turned into operating-system-specific instructions by the JVM, or NoCode/LoCode drag-and-drop integrations. Most of the text-based layers of abstraction have been called “programming languages” and in that way, is Generative AI very different? It is still part of the chain between the human ideation of the software, and the executable software itself. If we all become expert prompt engineers and lose the ability to write Javascript code from scratch, is that any different from losing the ability to write C++ code because we write Javascript and get Google’s V8 engine to turn it into C++ for us? Software engineering is still the same skill, but the language used is now the language of prompt engineering. The only difference is in the fact that the Generative AI answer varies over time and hence the process of debugging could get pretty interesting .

In conclusion, there are fewer use cases than one might think for Generative AI, and whilst it will be a great tool for developers and will probably be as game-changing as IDEs were in the 1990s, it isn’t quite the end to software engineering that people are predicting - developers, don’t go hanging up your boots just yet.

Content Generation

One of the often-touted use cases of Generative AI, within and without the Software Development community, is its ability to generate content for blogs and articles. (Perhaps it is also able to read said content and “like” it?) As an example, we can compare the content of this article with the content generated when we ask Microsoft Copilot to write it, using this prompt:

“Write 2000 words on the use cases for Generative AI in the software development lifecycle. Include a definition of machine learning, generative AI and general AI in the first section. Use a formal but friendly tone. Write with the persona of a cautious software engineer.”

Copilot

Certainly! Let’s delve into the fascinating world of Generative AI and its impact on the software development lifecycle. As a cautious software engineer, I’ll guide you through the nuances while maintaining a friendly tone.

Before we explore the use cases, let’s clarify some terms:

Machine Learning (ML): ML is a subset of artificial intelligence (AI) that enables systems to learn from data and improve their performance over time. It involves algorithms that automatically learn patterns and make predictions based on examples.

Generative AI (GenAI): GenAI goes beyond traditional ML by creating new content rather than just predicting existing patterns. It generates novel outputs, such as text, images, or code, based on input data or prompts.

General AI (AGI): AGI refers to an AI system with human-like cognitive abilities, including reasoning, problem-solving, and creativity. While AGI remains an aspiration, > GenAI represents a step toward achieving it.

Code Generation and Assistance:

GenAI tools like ChatGPT and GitHub Copilot assist developers by suggesting code snippets, improving productivity, and reducing manual coding efforts.

However, it’s essential to recognize that GenAI isn’t a silver bullet. It complements human expertise rather than replacing it entirely.
Content Creation:

GenAI can create documentation, blog posts, and other textual content. It helps developers communicate effectively and maintain consistent documentation.
Test Script Generation:

Beyond user stories and test cases, GenAI can generate test scripts and scenarios.

It identifies edge cases, executes automated tests, and logs defects, enhancing software quality.
Unit Test Cases:

GenAI assists in writing unit test cases, ensuring robust code coverage.

Developers benefit from faster test case creation and improved code reliability.
Source Code Generation:

While GenAI isn’t writing entire applications (yet!), it can generate boilerplate code, saving time during initial setup.

Developers remain responsible for critical design decisions.

Quality Control:

Always review GenAI-generated content critically. It’s a tool, not a replacement for human judgment.

Verify that the code aligns with best practices, security guidelines, and project requirements.

Ethical Implications:

GenAI learns from existing data, which may contain biases. Be cautious about perpetuating bias in your code.

Regularly assess the ethical impact of GenAI-generated content.

Balancing Speed and Quality:

GenAI accelerates development, but don’t sacrifice quality for speed.

Maintain a balance—use GenAI where it enhances productivity without compromising reliability.

Generative AI holds immense promise for software development. By understanding its capabilities, limitations, and ethical considerations, cautious engineers can leverage it effectively. Remember, GenAI is a valuable ally, but it’s our collective wisdom that truly drives innovation in the software development lifecycle. As a cautious software engineer, I’ve emphasized practical insights while maintaining a friendly tone. If you have any further questions or need clarification, feel free to ask! 😊 Source: Conversation with Copilot, 19/06/2024
References given:

Generative AI and the software development lifecycle

Automating the Software Development Life Cycle with Generative AI

How Generative AI is Reshaping the Software Development Lifecycle: From …

IMPACT OF GENERATIVE AI ON THE SOFTWARE DEVELOPMENT LIFE CYCLE (SDLC)

Not bad, Copilot! However it’s only 500 words. No amount of re-prompting and scolding would generate more. Seems you still can’t beat humans for their sheer ability to vacillate!

Use Cases for AI was originally published by Capgemini at Capgemini Software Engineering on September 06, 2024.

AWS Simple Email Service Security

2024-07-12T00:00:00+01:00

In the evolving landscape of digital communication, email spoofing poses a significant threat to the integrity of online identities. As a hobby project, and to practice AWS and software design, I have been developing an application called MovieBooker. This application streamlines the process of booking movie tickets and managing movie programs for both moviegoers and staff.

Email Integration with AWS SES

As part of the infrastructure for the MovieBooker application, I implemented Amazon Simple Email Service (SES) to send emails when a user pays for a ticket. However, I encountered a challenge where the messages contained a banner with a “Be careful with this message” warning. This marked the beginning of my quest to unravel the secrets of email security.

Strengthening Email Security with SPF, DKIM, and DMARC

In order to get rid of the “Be careful with this message” warning. I had to set up Sender Policy Framework (SPF), Domain Keys Identified Mail (DKIM) and Domain Message Authentication Reporting and Conformance (DMARC).

SPF is an email authentication standard that helps protect senders and recipients from spam, spoofing, and phishing. DKIM is a standard email authentication method that adds a digital signature to outgoing messages. DMARC helps mail administrators prevent hackers and other attackers from spoofing their organization and domain. Spoofing is a type of attack in which the From address of an email message is forged.

Determined to make my application’s emails as trustworthy as possible, I delved into SPF, DKIM, and DMARC. These three tools promised an end-to-end authentication mechanism that would transform my emails into beacons of trust.

What You Will Learn

I aim to share the valuable insights I gained while implementing a secure end-to-end email authentication system. In this article, you will learn how to harness the power of three essential tools:

SPF (Sender Policy Framework)
DKIM (DomainKeys Identified Mail)
DMARC (Domain-based Message Authentication, Reporting, and Conformance)

These lessons were acquired through practical application in fortifying the email communication of my application. Join me as I guide you through the process of implementing these robust security measures.

Understanding the Threat Landscape

Insecure email communication poses various risks, including phishing attacks that trick recipients into divulging sensitive information, email spoofing for deceptive purposes, and business email compromise (BEC) threats involving unauthorized access to legitimate accounts. Additionally, insecure emails may damage the sender’s domain reputation, result in financial losses, and lead to regulatory compliance violations.

To mitigate these risks, it’s crucial to implement robust email security measures such as encryption, multi-factor authentication, and the adoption of authentication protocols like SPF, DKIM, and DMARC. Educating users on recognizing and avoiding phishing attempts further enhances overall email security.

Overview of SPF, DKIM, and DMARC

SPF - Sender Policy Framework:

SPF acts as a set of rules and regulations for email servers, authorizing them to send emails on behalf of a specific domain.
It enables the receiver to determine whether an incoming email is genuinely sent from an authorized server associated with the claimed domain.

DKIM - DomainKeys Identified Mail:

DKIM adds a digital signature to outgoing emails, creating a unique identifier that can be verified by the recipient.
The recipient checks the signature against the public key in the DNS records, ensuring the email is unaltered and genuinely originated from the claimed sender’s domain.

DMARC - Domain-based Message Authentication, Reporting, and Conformance:

DMARC acts as a protocol that instructs the email receiver on how to handle authenticated emails.
It provides options such as acceptance, marking as spam, quarantine, or rejection based on SPF and DKIM verification results.

Benefits of SPF, DKIM, and DMARC:

Mitigates the risk of email spoofing and phishing attacks: Ensures only authenticated emails are accepted, reducing the likelihood of malicious activities.
Enhances the reputation of your domain: Implementing industry-standard email authentication practices contributes to a positive domain reputation.
Provides a clear framework for email handling: Reduces the chances of legitimate emails being marked as spam, improving communication reliability.
Aids in achieving and maintaining regulatory compliance: Particularly beneficial in industries with stringent data protection regulations, helping organizations avoid legal consequences and financial penalties.
DMARC protects the reputation of the sender’s domain: Provides visibility into email authentication results, preventing unauthorized use of the domain in phishing attempts and strengthening overall brand trust.
Contributes to the mitigation of data leakage: Ensures that sensitive information transmitted via email is secure and has not been tampered with during transit.

Establishing a Secure Communication Channel

By implementing SPF, DKIM, and DMARC, we establish a formidable defense against unauthorized email activities, offering a secure and reliable communication channel that fosters trust and confidence among users.

What is AWS SES?

Amazon Simple Email Service (AWS SES) is a cloud-based email sending service designed to simplify the process of sending emails from applications and websites. Its role extends beyond just delivering emails; AWS SES serves as a comprehensive solution for ensuring secure and reliable email communication.

Key Features:

Comprehensive Email Solution: AWS SES goes beyond basic email delivery, providing a robust solution for secure communication.
Easy Integration with AWS Services: Designed for easy integration with other AWS services, making it straightforward to implement and manage SPF, DKIM, and DMARC alongside complementary AWS solutions.
Seamless Integration with Authentication Protocols: AWS SES seamlessly integrates with SPF, DKIM, and DMARC authentication protocols, providing a strong framework to verify sender identity and prevent email spoofing or phishing attempts.
Configurable Email Sending Settings: Allows users to configure and customize email sending settings, enabling the implementation of specific security measures tailored to the organization’s needs.

With AWS SES, you can ensure that your email communication is not only efficient but also adheres to industry-leading security practices.

Implementing DKIM with AWS SES

To enhance the security of your email communication, you can implement DKIM (DomainKeys Identified Mail) with Amazon Simple Email Service (AWS SES). Use this link to set up DKIM: Setting up Easy DKIM

Set Up Public Key in DNS: Once DKIM signatures are enabled, you need to set up the public key in your DNS.

Retrieve the public key from the AWS SES console.
Add a TXT record to your DNS with the DKIM public key.

Example DNS Record:

Name: _domainkey.yourdomain.com
Type: TXT
Value: "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBA..."

These steps will guide you through the process of setting up DKIM for your AWS SES configuration, adding an extra layer of authenticity to your outgoing emails.

Setting up SPF

Before we configure SPF, it’s essential to understand the key components of an SPF (Sender Policy Framework) record. SPF is a crucial authentication mechanism that helps prevent email spoofing and phishing attacks. Below is a detailed breakdown of the components in an SPF record.

Version (v):

The SPF version is specified using the “v” tag, followed by the version number. For example, “v=spf1” indicates SPF version 1.

Mechanisms:

Mechanisms are the core components of an SPF record that define the rules for email server authorization. Common mechanisms include:

a: Allows the domain’s A record (IPv4 address) to send mail.
mx: Allows the domain’s MX record (mail exchange) to send mail.
include:domain.com: Permits the specified domain to send mail on behalf of the current domain.
ip4:192.168.0.1: Authorizes the specific IPv4 address to send mail.
all: Specifies the default action for servers not covered by previous mechanisms. It can be set to “allow,” “deny,” or “neutral.”

Directives:

Directives modify or refine the behavior of the mechanisms. Some common directives include:

redirect: Redirects the SPF check to another domain’s SPF record.
exp: Explanations that provide additional information in case of SPF failures.
exists: Checks if a specified domain has an A record, allowing for conditional SPF processing.

Qualifiers:

Qualifiers define the result of a mechanism or directive. There are four possible qualifiers:

+ (Pass): The server is authorized, and the SPF check passes.
- (Fail): The server is not authorized, and the SPF check fails.
~ (SoftFail): The server is not authorized, but the SPF check may still pass (often used for testing or gradual SPF deployment).
? (Neutral): No explicit authorization or denial; the SPF check is inconclusive.

Now, armed with an understanding of these components, you can proceed to configure SPF for your domain, strengthening email authentication and security.

Example SPF Record

Below is an example SPF (Sender Policy Framework) record that demonstrates the use of different mechanisms and qualifiers:

v=spf1 include:_spf.example.com ip4:192.168.0.1 -all
Name: yourdomain.com
Type: TXT
Value: "v=spf1 include:_spf.example.com ip4:192.168.0.1 -all"

Create an SPF format "v={spf version}{mechanisms}{directive}{qualifier}all"

Follow this link for more information on DNS settings: DNS settings. I added this TXT entry in Route 53 for SPF support:

Configuring DMARC Policies

Now, let’s walk through the steps to configure DMARC (Domain-based Message Authentication, Reporting, and Conformance) policies for your domain in Amazon Web Services (AWS) using Route 53:

Log in to Amazon Web Services: Go to https://aws.amazon.com/ and log in to your AWS account.
Navigate to Route 53:
- Under “Services,” find and click on “Route 53” under “Network & Content Delivery.”
Access Hosted Zones:
- In the “DNS Management” section, select “Hosted Zones.”
Select Your Domain:
- From the list of hosted zones, find and click on the domain you want to configure DMARC for.
Create a Record Set:
- Click on the “Create Record Set” button.
Configure DMARC Record:
- In the “Name” field, type “_dmarc”.
- Under “Type,” select “TXT” from the drop-down menu.

Example DMARC Record:

Name: _dmarc.yourdomain.com
Type: TXT
Value: "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com; ruf=mailto:dmarc@example.com"

Generate a value for the DMARC record set on this site powerdmarc

Add the DMARC record to the DNS record in Route 53:

Conclusion

The implementation of Sender Policy Framework (SPF), DomainKeys Identified Mail (DKIM), and Domain-based Message Authentication, Reporting, and Conformance (DMARC) within the framework of Amazon Simple Email Service (SES) presents a robust and multifaceted approach to fortifying email security. By embracing these authentication protocols, organizations and individuals can significantly mitigate risks associated with phishing, email spoofing, and unauthorized access.

SPF as the Gatekeeper: SPF acts as a gatekeeper, specifying authorized email servers and reducing the likelihood of malicious actors impersonating trusted sources.
DKIM for Extra Security: DKIM adds an extra layer of security by providing a unique digital signature, ensuring that the integrity of the email remains intact during transmission.
DMARC as the Decision-Maker: DMARC serves as the wise decision-maker, guiding email receivers on how to handle authenticated emails based on SPF and DKIM verification results.
AWS SES Integration: AWS SES, as a reliable and scalable cloud-based email sending service, seamlessly integrates these authentication mechanisms. Its features, such as scalability, deliverability, and compatibility with industry standards, make it a suitable choice for organizations looking to enhance their email security posture. Real-time analytics and monitoring tools, combined with cost-effectiveness, ease of integration, and a developer-friendly interface, contribute to a comprehensive solution for secure email delivery.

In adopting SPF, DKIM, and DMARC with AWS SES, organizations not only safeguard their digital identities but also bolster user trust by ensuring that emails are genuine, secure, and delivered with integrity. This strategic combination lays the foundation for a secure communication channel, fostering a resilient defense against the evolving landscape of email-based threats. As the digital realm continues to advance, this approach stands as a testament to proactive and effective email security practices.

The best part is you don’t get the “Be careful with this message” warning anymore 🙂.

AWS Simple Email Service Security was originally published by Capgemini at Capgemini Software Engineering on July 12, 2024.

DCX React Library v1.0.0 is with us and we’re proud of it

2024-05-24T00:00:00+01:00

In 2022 we announced a new style-agnostic React library called dcx-react-library and today, after almost 3 years of hard work and 14 releases, we happy to share that version 1.0.0 is finally available.

What’s new

We’ve built more than 35 components organised in 3 different categories: forms, utilities and typography, keeping in mind our main principles:

fewer dependencies
small bundles
fully tested
AA accessible

I’m pretty sure the main question you’ll be asking yourself is: “why did you decide to make a major release version?” After using the library on our projects, we realised that we sped up the development process as we promised but being style agnostic by nature our development team was spending too much time making our components look pixel perfect and exactly how our UX/UI team designed. We asked ourselves: “now that we have got the foundations, can we do better?” Yes we can! So we decided to implement a design system for most of our components. Most of our components have been redesigned without introducing any breaking change adding the usage of tokens. We improved our documentation adding a new section called design system:

If the use of tokens has piqued your interest and you are curious to understand how it works, we have created a quick video that illustrates how to start using it:

What’s next?

Now that version 1.0.0 is finally available we will focus on implementing the design system for our remaining components, we’ll continue to implement new component (if your project needs components that we don’t have yet, please don’t be shy, create a request on our github repository and stay tuned… something else is cooking in the Capgemini labs.

You want to know more?

If you’d like to know more about the library, feel free to contact Daniele Zurico.
If you curious to give it a try, it’s publicly available on npm.
If you’d like to contribute, you will need to request access from our public repository.
If you want to familiarise with all the components we built, take a look at our storybook documentation.

DCX React Library v1.0.0 is with us and we’re proud of it was originally published by Capgemini at Capgemini Software Engineering on May 24, 2024.

Automation Scripting with Bash

2024-05-24T00:00:00+01:00

The need for automation in software engineering, especially in Cloud and Platform Roles is apparent. Manual, repetitive tasks consume time, introduce the possibility of human error and can become tedious for team members. In this blog post, I wish to document my journey of automation using Bash. We’ll look at the process and explore how automation can positively impact our workflows and our team.

Automating Simple Commands with Bash Scripts

For my first project at Capgemini, I have been working as a Platform Engineer within Cloud & Custom Applications. I quickly recognised the importance of automation in streamlining workflows. I took on the challenge of automating repetitive tasks, in the hopes of saving time and reducing human error.

Within my role, I find myself frequently using the same set of commands. For example, I often type vault read --field=value <path> | openssl x509 -noout -text, a command commonly used for reading and inspecting certificates stored in HashiCorp Vault. The symbol | is called a “pipe” in Bash scripting. It allows the output from the command on its left side to be used as input for the command on its right side. For example, in the command above the output of vault read --field=value <path> is passed to openssl x509 -noout -text for further processing.

This got me thinking: why not create a basic bash script that takes the vault path as an argument, which I could then alias as a command. That way, I could simply run <alias> <path> to fetch the certificate. Not only would this save me time, but it would be easier on the fingers.

So I set about doing this as below:

The first step in creating a script is to create a file ending in .sh, here we will create a file called vaultcert.sh.
Next, open the file in a text editor and define its functionality: The script accepts an argument for the vault path. This argument is passed as the path to the command vault read --field=value <path> | openssl x509 -noout -text.

#!/bin/bash

# Function to read certificate from Vault
read_certificate() {
    # Check if argument is provided
    if [ -z "$1" ]; then
        echo "Error: Vault path argument is missing."
        exit 1
    fi

    # Execute the command to read certificate from Vault
    vault read --field=value "$1" | openssl x509 -noout -text
}

# Invoke the function with the provided argument
read_certificate "$1"

3. We must set executable permissions to ensure the script is able to execute. We can do this by running in Linux:

chmod +x vaultcert.sh

4. An alias allows the script to be executed from a single command. We can do this by adding the following to our shell configuration file (e.g. .bashrc or .bash_profile):

alias vaultcert='/path/to/vaultcert.sh'

5. Now we must reload the shell configuration to apply the changes. We can do this by sourcing: source ~/.bashrc or source ~/.bash_profile.

6. Finally, confirm that the alias works by running the following command in the terminal:

vaultcert <path>

And that’s it you have your first automated bash script!

Automating More Complex Commands with Bash Scripts

Once I had my first bash script up and running, I challenged myself to automate more complex tasks. With Bash scripting, we can make tasks such as managing files much easier. The possibilities of streamlining workflows with Bash scripting and automating complex tedious tasks are endless.

For example, the next automation script I worked on expanded upon the first one, but allowed the user to copy the file stored in vault to their local machine and then write to a further vault path. All we have to do is follow the steps above and create a new script:

#!/bin/bash

copy_write_certificate() {
    # Check if arguments are provided
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "Error: Arguments are missing."
        exit 1
    fi

    # Execute command to copy certificate from Vault to local and write to new destination
    vault read --field=value "$1" > /path/to/local/file
    # Check if the file was copied successfully
    if [ -f "/path/to/local/file" ]; then
        echo "Certificate copied successfully to /path/to/local/file"
    else
        echo "Error: Certificate copy failed."
        exit 1
    fi

    # Prompt user to confirm writing the file to the new destination vault path
    read -p "Do you want to proceed with writing the file to $2 (y/n): " choice
    if [ "$choice" = "y" ] || [ "$choice" = "Y" ]; then
        vault write --field=value "$2" value=@/path/to/local/file
        echo "File successfully written to the new vault path."
    else
        echo "Cancelled writing to vault."
        exit 1
    fi
}

# Invoke the function with the provided arguments
copy_write_certificate "$1" "$2"

As demonstrated, Bash scripting opens up endless possibilities. We can increase the efficiency of our team by automating frequently used commands and sharing this with our team. Adding docs and sharing demos helps facilitate collaboration and boost productivity across the entire team.

We have explored my journey of bash automation on my first project here at Capgemini. In my next blog post, we will look at taking this one step further by introducing Python scripting, also discussing the use cases for Bash vs Python.

Thanks for reading!

Automation Scripting with Bash was originally published by Capgemini at Capgemini Software Engineering on May 24, 2024.

Devoxx 2024

2024-05-17T00:00:00+01:00

Do queues in the ladies’ toilets signify success at an IT conference? Is a half-hour sit-down worth more than a T shirt? These and other deep questions pondered.

Capgemini’s Cloud Development team are returning from our annual pilgrimage to Devoxx UK, the best and biggest annual IT developer conference in the UK. As is becoming usual, a train strike limited my attendance on the Wednesday, but even in a couple of days there has been much to inspire and mull over. This year, instead of a full booth, Capgemini sponsored a corner filled with super-comfy bean bags, and it was the most popular our area has ever been! As always it felt like a very inclusive conference, with a wide range of age groups and nationalities, and a good balance of genders and races present.

AI Again

As with last year there was a strong theme of AI across many of the conference talks, but this year the emphasis was more on developer involvement - less “Wow look what this can do”, and more “This is how you can build things”. Some angles were difficult to see a production use-case for - for example, Developing Intelligent Apps with Semantic Kernel showcased how with the Semantic Kernel Java API you can allow generative AI access to call your Java functions whenever it considers them appropriate. Having recently taken Capgemini’s “Responsible AI” training course and appreciating that generative AI’s decision making process cannot be considered “correct”, and so you could only use this scenario when it didn’t matter if the AI got the decision to make the Java call wrong, this really limits usage. Deciding when to delete development resources, perhaps? Or rotating a “thought for the day” message? Nothing more important, please! The speaker’s (inevitable) example of roleplaying games seemed safe enough. More AI concerns in Harel Avissar’s talk “The State of Malicious ML Models Attacks”. There are now popular repositories such as HuggingFace where people can share their pre-trained large language models (LLMs), which once trained are, as Harel pointed out, basically deployable units the same as any other software artifact. As such, they are open to attack in the same way as other pipeline artifacts and need to be signed and secured in the same way. This is quite a new field for attacks, but when scanning the LLMs available to download they did find some examples where LLM pickle (.PKL) files, which can run Python scripts when you open them, contained malicious “reverse shell attack” code. Something to be aware of.

AI and Vectors

Quite a few talks focussed on looking at vector databases, commonly used in LLMs, and the algorithms needed to search them. Vector types can represent the relationships between concepts across multiple dimensions. Mary Grygleski’s talk on Friday covered how ChatGPT’s vectors represent 1,536 dimensions. This would mean, for example, it could store 1,536 different contexts for the word “cat”. Handy. But not brand new, and not specific to generative AI - Elasticsearch for example has been using a vector database and the HNSW search algorithm to great effect in its fuzzy searches for years. In fact, as I discovered in my last blog post, the powers of fuzzy search and natural language processing are often more real-world useful than generative AI. You don’t usually want a computer system to come up with its own answer to a question, you want it to go away and find a definitive answer!

Thoughtful Architecture

Conference regular Andrew Harmel-Law was back with what some has said is his best talk yet - “How we Decide”. I didn’t attend as I had recently seen Dave Snowden’s talk on “being human in an age of technology”, in which he went into detail on the biological human process of decision-making - for example, did you know that groups of 5 or less will always reach consensus? Possibly because this is a common family group size, allowing ancient humans to move quickly in family units. And did you know that only 4 in 5 of your decisions are made by the conscious brain, with the other one coming pretty much straight from the body’s senses? Things like pheromones have a big impact on decision-making when face to face. I was still processing this information so skipped Andrew’s talk for a deep dive into container-based IDEs, but I feel I missed out and am looking forward to the video! I did attend Barry O’Reilly’s joyous “An Introduction to Residuality Theory”. Any talk that begins with a Douglas Adams quote and goes on to arm the attending developers with some REALLY DIFFICULT questions to put to enterprise architects (especially difficult if you don’t allow them to use the word “magic”) is always going to go down well, and O’Reilly went on to confirm my suspicions that the way we approach software architecture is less science and more art. He pointed out that our architecture diagrams don’t capture time, change or uncertainty - hugely important factors in software engineering - and how good software architects must be really comfortable with the concept of uncertainty. He had lots of great quotes about how Agile is a reaction to the realisation that requirements don’t work for complex systems, but that using Agile and reacting to change when problems arise leads to flaky architectures. His revelation, and the topic of his Ph.D thesis, is that software architecture could be a science, if we use methods from complexity theory. He went on to outline some really excellent ideas about how to robustify your candidate architecture using random simulation and ideas based on Kauffman’s networks and attractors. Plus, what to do if fire-breathing lizards should happen to climb out of the Thames. Forewarned is forearmed!

More microservices

Microservices haven’t gone away from Devoxx, and this year there were a couple of talks debating / re-debating the monolith v microservice argument. I especially enjoyed Chris Simon’s Modular Monoliths & Microservices - A 4+1 View Continuum in which he advised us all not to bother with the debate as both terms were becoming meaningless! Even a monolith is probably a distributed system. And if you view your client browsers as a scaled-out cluster of your application front end, which you probably should, even the simplest static website becomes a vast distributed network. He recommended moving away from the terms microservice / monolith and instead directly using terms such as “process” (a thing you can start and stop), and “node” for distributed compute, and teams and repositories for code ownership. He talked about the power of Domain Driven Design (DDD) when building up the logical view of your system, and how mapping processes onto bounded contexts can give you a candidate physical architecture, although don’t shy away from multiple processes in a bounded context if, for example, there are asynchronous steps between multiple processes (i.e. a queue). He also gave the quote of the conference when he said people often found they had a service and “distributed the wazoo out of it and we don’t know why”. He talked about the development view of an architecture and how your code repository should always align with the logical view. If this doesn’t happen you end up with inter team coupling - behaviours such as “Scrum of Scrums” mean that it is becoming very hard to get decisions made and is a clue that your development view isn’t in alignment. He also talked about how we should reconsider our assumptions on the physical view of the architecture - we often jump to the conclusion that a microservice needs its own infrastructure but this may not be the case, and we often assume that a repository must be deployed as a monolith when in fact you can use path filters on your monorepo to deploy separate sections at different times.

Fear of Rust

For something new, I went to see Ayoub Alouane talk about “My discovery of Rust: Why is it a Game Changer?” where he showed an example of Discord’s “Message Marker” app showing regular spikes in CPU that corresponded with garbage collection. The corresponding Rust application had no such spikes. Why? Rust, announces Ayoub, does its memory management at compile time! He went on to demonstrate with a simple loop application how, for each variable, you had to specify the memory details of the variable:

Is it mutable?
Is it usable in multiple threads?
Is it lockable?
Should you clone it into threads?

All this can be expressed through the syntax of the language. A bit of a pain to learn, but well worth it. A lot of end-of-life C applications are being migrated to Rust, and after this talk I can understand why. He gave another example of rewriting a node.js app in Rust and using 95% less memory and 75% less CPU! OK, game-changing point taken! This talk of application efficiency fit well with our own Shirley Tarboton’s talk on Decarbonising Code. She gave some examples of easy-to-implement coding practices that can make your application consume fewer resources. It was a well-received talk in the huge auditorium theatre - quite a feat to get people to attend during lunch!

Crashing and burning

Last thing Thursday I popped into a talk called Mayday Mark 2! More Software Lessons From Aviation Disasters.. As speaker Adele Carpenter pointed out, the Venn diagram of software engineers and aviation geeks is, “basically a circle”. Previous aviation disaster talks I’ve been to focus on ways of working we can both benefit from - for example, focussing on psychological safety, failing fast and failing openly. Adele’s talk, however, was a rather harrowing blow-by-blow account of some major air crashes from the past few decades. There were some important lessons about how much information humans can cope with, how humans react unexpectedly in stressful situations, the importance of familiarity and expectations in UI design, but overall it was a rather sobering affair. Which is just as well, since we all poured straight out of the talk and into the local pub for the famous Devoxx party, complete with IT-branded beers. Cheers for another year!

Continuous Change

On Friday I went to a talk very relevant to my current project - Chris Simon’s “Winning at Continuous Deployment with the Expand/Contract Pattern”. My project has a fairly typical Continuous Integration pipeline, but we still do production releases via manual intervention on a business schedule. One of the reasons is schema changes which cause backward incompatibility between services - this can cause outages if our server-side (message recipient side) services are updated before the client side (message sender) services - because even if they manage to process the old-style message, they may send a response the new client isn’t expecting. So whilst we have eventual consistency, a few requests may get lost along the way during the service restart. The Expand/Contract pattern addresses this by putting the onus on developers to create an “interim” server-side application which can accept both old and new client messages, and sends both old and new response formats. So if you used to have an application that took a username and returned OK like this

{
 "name": "Sarah Saunders"
}

{
 "response": "OK"
}

and the new version would take first name / last name and return OK and an ID:

{ "name": {
 "first": "Sarah",
 "last": "Saunders"
}}

{
 "id": "2134-ker-438052u"
}

In this scenario, the “interim” application would accept both inputs, and would return an output like this:

{
 "response": "OK",
 "id": "2134-ker-438052u"
}

This should allow client applications to talk to the server both before and after they are migrated to the new version. Then, once all clients are migrated, the “Contract” stage of the release occurs and the server can be moved to only accept/respond in a V2 style.

I really like this pattern and I don’t think developers would mind writing the extra code for the interim release. It does spell out though, that deployment must be a consideration right from the design stage of an application, and Chris did raise this point at the start - mentioning his other talk goes into more detail about how Test Driven Development (TDD) helps this by defining strong contractual interfaces between components, and is a strong enabler for continuous deployment. I have long been a strong advocate of TDD, and here is yet another reason why getting ChatGPT to write your tests for you is no substitute at all for proper TDD!

In conclusion, a really great conference for affirming that we (Capgemini Cloud Development team) are on the right lines with the software and architectures we are currently building for our clients, our use and understanding of AI aligns well with the global community (or is sometimes a little bit ahead) and we have lots of great lessons to learn to improve our architecture and deployment practices even further. Looking forward to the next year.

Devoxx 2024 was originally published by Capgemini at Capgemini Software Engineering on May 17, 2024.

Exploring MacOS Development: Creating a Menu Bar App with Swift & SwiftUI

2024-04-11T00:00:00+01:00

For a first foray into MacOS development, I built a custom menu bar application. The Hijri Menu Bar allows the user to view the current Hijri date. It is convenient, minimal and easy to use. Below, we’ll highlight the development process, looking at the code, build process and challenges faced.

Development

The development process began with setting up the project in Xcode. Swift and SwiftUI were chosen as the foundations for the app. Essential components such as the AppDelegate, which manages the app’s startup and background tasks, were incorporated, alongside a custom StatusItemView to handle the UI and what the user would see.

App Delegate

The AppDelegate class is responsible for setting up the status bar item, managing the popover and handling user interactions. It initialises a clickable item in the menu bar with a calendar icon.

An NSPopover is essentially a holder to display the contentView (a SwiftUI view). This popover will utilise a custom SwiftUI View to display the current Hijri date information.

Now we have a clickable item in the menu bar and a popover. We need to ensure when the item is clicked, the popover will appear on the screen.

To achieve this, the Button action and target lines establish the target-action pattern. Clicking the button will now trigger the togglePopover(_:) function (defined below).
This function decides whether to show or hide the popover, based upon it’s current state.

The togglePopover(_:) is a custom function. It first checks for the validity of both the button and popover. If the popover is already shown, it will close it.
If the popover is hidden, it fires a custom notification .fetchHijriDates. This will be received later in our StatusItemView (The SwiftUI view responsible for displaying the dates).

This function also displays the popover relative to the menu bar icon’s bottom edge.

Applications in MacOS can have different activation policies that determine how they behave in the user interface. Setting the activation policy to accessory mode means that the application will remain active in the background, even if the user closes its window or quits the application. This is useful for menu bar applications that need to remain running without requiring constant user interaction.

User Interface

After configuring the AppDelegate, the focus shifted to building the user interface. A new file is created called the StatusItemView. This will be responsible for displaying the current Hijri date in the popover created above.

This view manages the date fetching and presenting that to the user. The app starts with state variables which will store the current Hijri dates in Arabic and English.

Next, we utilise Text views to display the information stored in the state variables (i.e. the current Hijri date in Arabic and English with appropriate formatting.)

SwiftUI offers a horizontal stack HStack which is used here to display a copyright message and an information icon. Tapping the information icon toggles the visibility of an instructions menu (which is coded in a separate InstructionsView).

The app fetches the latest Hijri date in two ways: When the view first appears on the screen, it grabs the latest date for an accurate display. This is achieved using the onAppear method. However, this did not seem to refresh the dates subsequently, say every midnight, hindering the functionality of the app.

We can achieve this through utilising Swift’s NotificationCenter class. Here, a custom notification fetchHijriDates is triggered from the AppDelegate every time the popup is shown. The StatusItemView listens for this notification. When it is received, it fires the custom fetchHijriDates function (explained next) which will handle fetching the current date.

Behind the scenes, a custom fetchHijriDates function is responsible for fetching the current Hijri dates. It formats them according to Arabic and English locales and updates the state variables created earlier with those formatted dates. It is called both when the view appears and in response to the fetchHijriDates notification, discussed earlier.

Here we are able to leverage Apple’s built-in Date and Calendar functionality, using Calendar(identifier: .islamicUmmAlQura). This (satisfying) pre-built functionality saves us time and ensures accuracy and uniformity in our date handling.

This is the end of our discussion on the code. The full code can be found on the GitHub repo.
Below we will look at the rest of the development & build process.

Build Process & Challenges

Building the app in Xcode involved compiling the code, resolving errors and creating a distributable package. This process also involves setting the minimum macOS version, which is set to 12 for this app.

App Icon

Creating a visually appealing icon was important. A custom icon was designed and added to Xcode in various sizes to ensure it adapts to different screen resolutions. I used this free online tool to achieve this.

Distribution

Distributing through the Mac App Store, requires a paid Apple Developer ID, so I chose to make it freely available on GitHub. Subsequent releases were uploaded via GitHub Releases.

Contemplating Further Features

Features like automatic launch at login were initially considered. Launch at login ensures the menu bar icon is always visible (i.e. that the app is started after the user logs in/restarts their machine). However the focus remained on core functionality and a simple user experience. Instructions are provided to users during installation to manually set launch at login.

Creating DMG File

To ensure smooth installation, I created a DMG file, using the create-dmg tool. Clear and concise instructions were included to guide users through the installation process. Using the downloaded DMG file, users can drag the app to their Applications folder, and launch it from there.

Conclusion

This journey, from concept to distribution, has been a valuable learning experience in macOS development. I hope to continuously improve the Hijri Menu Bar app and provide users with a seamless way to view Hijri dates on their Macs.

If you’re interested in trying out the app or contributing to its development, you can find it on GitHub.

Exploring MacOS Development: Creating a Menu Bar App with Swift & SwiftUI was originally published by Capgemini at Capgemini Software Engineering on April 11, 2024.

How to (maybe) replace your HR department in 3 easy steps

2024-01-19T00:00:00+00:00

I recently had the privilege of judging an internal Capgemini hackathon. It was an open brief, but the focus was to be on technology and its application to solve a real-world business problem. The entries were varied and excellent, from a dashboard to assess how warm/busy/accessible the office was so you could decide whether or not it was worth going in, to gamification of training, to improvements for mountain search and rescue teams. One of the major commonalities across many of the entries was the use of “AI”, where, given our common use of the Azure platform, AI tended to be defined as Azure cognitive search (recently renamed Azure AI) indexing a set of business documents, with a natural language processing unit on top to act as a chatbot. This made me want to have a go myself, and see what I could build!

Is it AI?

This architecture, for me, isn’t really using the “AI” bits of AI - despite the fact that if you use ChatGPT (which is available as a product in Azure since Microsoft’s purchase of OpenAI) there can be some non-deterministic, generative functionality, but it sure is useful and could probably ease the burden of the HR and support departments of many organisations - and could possibly even replace a lot of the staff in these departments. I set out to see if I could build a HR chatbot to replace the kinds of queries a typical HR department employee might need to deal with. Why HR? Just because everybody hates them?? No…! - it’s because of the remit of HR, dealing with the employee lifecycle and needing to prove that a company acts without bias it must be a heavily process-driven department. These processes must be documented, and most of the workload of the department is in dealing with queries regarding the process. The incoming questions are probably not phrased in the same way as the process documentation, so some sort of fuzzy search is required in order to automate the question-answering process; for example, translating “how much paid time off do I get when my baby is born?” to “paternity leave allowances” is not a straightforward mapping. This is the reason that previous attempts to automate such departments have failed. Language is too complex for simple mappings and decision trees to replace a person on the end of a line - as anyone who has tried to navigate an automated telephone call will tell you. Who hasn’t ended up shouting “I WANT TO SPEAK TO A PERSON” down the line? But at the end of the day, the workload is simply regurgitating content from a document repository and the hard bit comes in finding the relevant sections - a process that is better automated as it’s a pretty unrewarding job acting as a knowledge base for people who can’t be bothered to read swathes of documentation.

The Architecture

As mentioned, the hackathon had been playing with the Azure cloud, so we’ll keep to this and use Azure’s concepts. These are pretty simply translated to any hyperscaler though, or to open-source alternatives if you want to host your own. For example, on AWS you could use Textract and Amazon Comprehend, and in the OSS world you’d perhaps use NLTK and Lucene.

Azure AI Search is a nice tool - a little more than document search, a little less than AI. It can be a bit clunky to get used to, and the price policy is per GB storage which is pretty bizarre - but this can be beneficial if you have query-intensive applications and a small-ish data set of documents. We use it as an exotic database view for one of our applications, and it took us a while to get used to the fuzzy query syntax - it’s not really designed for logical queries, it’s much better at giving you best-guess matches for loose search terms - and as such is well positioned to be the back-end of our HR chatbot.

The Method

I found a couple of tutorials and quick-starts to create chatbots on my documentation - Azure Search OpenAI demo or Query your documentation using Langchain

The issue I found is that it’s all moving quite fast - faster than the tutorials can keep up with. All mention of Langchain has now gone from the Azure portal (although you can still write your own Langchain chatbot), and QnA maker has now moved on and we have Azure AI Language Studio where you can add in your documents via a “Custom Question Answering” project, which is a type of Azure “Language” and can be created via the LoCode/NoCode Language Studio homepage. The tutorial speedily guides you through a simply-configured web form although it’s not quite clear what you are actually going to create - looking at what was deployed after the configuration steps, this sets up an Azure cognitive search (AI search) repository and then enables custom text classification / custom named entity recognition on the repository. The default behaviour for this appears to be breaking down the content in your referenced documents into paragraphs and pulling out likely titles/subjects. You can then modify this classification by adding in new questions and answers, or choosing the best answer for given terms.

The free trial only allows you to upload three sources into your AI search repository. So, for our HR example, I’ve downloaded three HR policy documents from this handy online repository and added them into my Custom Question Answering repository. This generates a “Knowledge Base” that I can then publish.

Here we can see the way that the content has been divided up into major terms and paragraphs that may address those terms. I can edit here, and once it’s published I can generate a Bot to act as the user interface to it.

OK so now onto creating this Bot. As Bots go, OpenAI’s ChatGPT is the real deal. Generative AI, pre-trained to recognise vast arrays of English language. For most use cases we would have to “turn off” all the fun, generative stuff for our application (see Guardrails below) and it’s probably overkill to use ChatGPT for this demo - plus, it isn’t included in the Azure Free Trial tier so I will be experimenting with the Azure AI Bot Service instead. It should be sufficient for this fairly small and simple demo. Cost-wise, the Azure AI Bot has a free tier, but it must be hosted via an Azure Web App whose service plan is defaulted to S1 (Standard). This plan, at £75/month to keep it running, is eating rapidly into my free credit!

Configuring the Bot online is pretty straightforward. The web GUI provides you with a customised template for creating the resources that you will need, creating an App Service Plan to launch an Azure WebApp that will host your Bot. The only config you have to do is enter the key of your Language Resource so that you can create a secure connection between the AI service knowledge base and the chatbot. This isn’t documented, but you can find the key by going back to the Azure Portal home and clicking the green squares “All Resources” view, then selecting your Language resource (the resource where Type = Language) and then selecting the Keys and Endpoint menu item. (There are two keys, so that you can refresh them by rotating them individually and hence avoid downtime. Either one is fine.)

Testing

Once your Bot is deployed, you can test it by finding it under All Resources and choosing “Test in Web Chat” from the right hand menu.

I tried with a simple question, that I know is answerable with the content in the documents:

So far so good. The Bot has successfully found the right bit of my documentation and returned a comprehensive and understandable answer. How about another:

Oh dear. “cannot” is not exactly a strong English sentence! Although it has found the relevant section of the documentation, it has not been able to pull out a contextual answer. I am not sure if it is the Language Service or the Bot which is struggling with this question. Enabling and examining the logs on the Bot Service isn’t that helpful - it just shows HTTP POST requests going to the Bot framework. The Bot framework should be responsible for breaking down the user’s entered text into logical “intentions” that the back-end question-answerer can respond to, and then delivering the back-end response in a human readable form. I eventually figure out how to enable logging on my Language Service and discover the query and response that the Bot has sent to the language service:

I can see that the language service has actually done a reasonable job. It’s identified the right paragraph for the query, but returned just a 38.97% certainty rating that this is the right data. Fair enough. So it seems that the issue is with the Bot being able to pull the right piece of text out of the response. This makes me start to wonder about the “Bot” I have deployed. What is it actually based on? There isn’t much documentation I can find, but you can download the source code, which shows that I have deployed something created by the BotBuilder SDK. I should be able to run this locally, but weirdly the Bot JavaScript code in my download seems totally out-of-date with the latest Language Studio API. I have to go back to the drawing board and use one of the later samples and update the code to correctly declare a method asynchronous to get the Bot running locally using the Bot Framework Emulator.

To get it to work using Node.js v18.16.0 and restify ^11.1.0, I had to edit the sample code index.js line 91 to declare the method async or it would not start:

// Listen for incoming requests.
server.post('/api/messages', async (req, res) => {
    adapter.processActivity(req, res, async (turnContext) => {
        // Route the message to the bot's main handler.
        await bot.run(turnContext);
    });
});

I was then able to run the Bot locally connecting to my Azure-hosted Language via the Azure Bot Framework Emulator. And of course as luck would have it, the latest sample doesn’t return such a poor response! It’s still not perfect, but it’s at least a sentence. See below.

It does also prove that the poor response here was the chatbot interpreting the data from the Language Service. The Language Service will return a field called an AnswerSpan which lists, with a confidence score, the section of the documentation it considers most relevant to the question. In the case of my “dismissal” question, the AnswerSpan returned was:

An employee whose appointment is terminated for any reason will be provided with a written statement of the reasons for the dismissal

This text was paired with a confidence score of 0.2880999999999997, or circa 29%. Fair enough. So how the cloud-deployed bot extracted the answer “cannot” from this is a bit of a mystery! The new version of my Bot prints the whole AnswerSpan and is, whilst still not exactly accurate, at least better. So how do I fix it?

Customisation

It seems the way to fix up these simple Bots is to go and add a custom question/answer into the Language Service knowledge base. I try adding a specific answer to the question, “Can I appeal against my dismissal?”. I re-publish the knowledge base and try again.

This looks much better. But it does imply that quite a lot of user testing and customisation will have to take place before this Bot is ready to replace its human counterparts.

Guardrails

One of the things that surprised people about ChatGPT, particularly in its earlier iterations, was that it was not trained to be accurate. It was trained to please the user. This would mean it would return inaccurate answers above telling you that it didn’t know the answer, as it had gauged higher satisfaction from “lying”! You don’t want your HR chatbot to lie, so you must use the guardrail settings to ensure that it does not. With ChatGPT, guardrails can be set using natural language, for example you can state:

{"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions.`

Instructions:
- Only answer questions related to taxes.

- If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information. "},
{"role": "user", "content": "When are my taxes due?”}

This configuration will prevent the chatbot from “making up” an answer if it cannot find a decent response in its repository. Configuring Azure’s ChatGPT chatbot via the GUI, to achieve the above you turn the setting known as “temperature” down to 0. The temperature represents how creative the chatbot can be in getting you an answer. A low temperature results in more “I’m sorry I don’t know” type answers, but increases the chances you’ll get an accurate answer, and that you’ll get the same answer when you ask the same question twice!

The Cost

So what does this cost to run in Azure? Depending on your Bot type, the cost can vary wildly. As mentioned, I am running my Language instance and my Bot instance in the free trial tier, so I am only paying for the app service to host them and this is around £75/month. If you were to use an enterprise ChatGPT Bot, costs are over £800/month fixed rate for 40 users, plus 80p per “usage unit” and £20 for any extra users over and above the plan. Still considerably cheaper than making your HR staff deal with these queries, I suppose.. As mentioned, Azure AI search is priced per GB of data indexed, the free tier runs up to 50 GB, Standard tier gives you 25 GB for 27p/hour.

In Conclusion

I am impressed with the Azure AI search offering - it’s powerful and useful - there are so many scenarios whereby we end up awash with documentation and cannot find the content we need. The chatbots are a varied bunch but I liked the way you could download the code and run/edit it locally with relative ease. In all, I feel this will be a very common architecture for the business problems of the next year or so.

How to (maybe) replace your HR department in 3 easy steps was originally published by Capgemini at Capgemini Software Engineering on January 19, 2024.

Keeping Spring Boot Apps Secure With HashiCorp Vault

2023-07-07T00:00:00+01:00

I have previously written blog posts about securing Spring Boot configuration with standard Kubernetes resources. In this post I’m going to take it a step further with a more productionised pattern of securing Spring Boot microservices with Vault in Kubernetes.

Keep It Secret, Keep It Safe

As a famous wizard once said; keep it secret, keep it safe. Whilst this applies to rings and other precious objects, it also applies to the sensitive data that we keep within our applications.

Security is paramount in productionised applications, often being one of the more challenging patterns to implement correctly.

Not All Secret Stores Were Created Equal

As I spoke about in my previous post, using Kubernetes secrets for storing sensitive data is considered bad practice for two main reasons:

The secrets themselves are stored in base64 format, which provides minimal security on its own.
By default, secrets are stored unencrypted in the underlying API’s data store (etd), meaning anyone with API access can retrieve and modify them.

There are several alternatives to using Kubernetes default secrets, and one of the most widely used tools is HashiCorp Vault. Vault is an identity-based secrets and encryption management system, that provides encryption services protected by authentication and authorization mechanisms. This makes it a much more secure way to store sensitive data. Additionally, Vault offers integration and authentication mechanisms with Kubernetes out-of-the-box, providing a proven and secure approach to managing secrets within your Kubernetes cluster.

Secure Doesn’t Mean Complex

Whilst being a challenging pattern to get right, security doesn’t need to be complex. Let’s walk through a simple example of how to set up the Kubernetes auth method locally, and retrieve secrets from a Spring Boot application using Spring Cloud Vault.

Configuring Vault

To begin with, we will configure Vault locally. HashiCorp has a set of available helm charts that you can apply, in order to test and work with Vault. For our example, we will be using the vault helm chart.

We will use Helmsman to manage our helm deployments. If you are interested in learning more about Helmsman, I recently wrote a blog post about it.

Our dev state file looks like this:

helmRepos:
  vault: https://helm.releases.hashicorp.com

apps:
  ...
  vault:
    namespace: dev
    enabled: true
    chart: vault/vault
    version: 0.24.1
    valuesFile: values/vault/values-dev.yaml

This will tell Helmsman to deploy the vault helm chart into our local dev namespace using the values file located at values/vault/values-dev.yml. The values file contains some simple overriding configuration to enable development mode for Vault. In doing so, it allows us to experiment with Vault without needing to unseal or store keys against it (Note: This should not be done in a production environment).

server:
  dev:
    enabled: true

Kube Auth Method

After applying the helmsman state file, we can proceed with configuring the Vault instance. There are two main ways to configure Vault: through the Vault UI or programmatically via the CLI. Since we prefer repeatable processes, having our Vault configuration in code is a better approach. Taking it a step further, we could use the Vault Terraform approach to treat this configuration as infrastructure-as-code. However, that goes beyond the scope of this example.

Configuration

Next we need to enable the Kube auth method. The easiest way to do this (programmatically) is via the Vault CLI (which comes pre-installed in the Vault container from the installed helm chart).

vault auth enable kubernetes

After enabling this feature, we need to configure the auth method to work with our local kubernetes cluster. There are several ways to configure this, but with the changes introduced in Kubernetes 1.21, there are some documented and recommended approaches. It’s worth reading through the different approaches and understanding their differences. However, for the purpose of this example, we will be using a local service account as the reviewer JWT because we have Vault running locally in a pod within our cluster.

To enable this configuration, we can run the following command:

vault write auth/kubernetes/config kubernetes_host=https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT

This configures the Vault auth method to use the service account token running in the Vault pod itself. This works because the Vault pod is running in our local cluster, the same cluster that Vault will be authenticating against later on when we send requests from our sample service. If Vault was running as an externally managed service (which is typical in a production environment), this approach wouldn’t work, and we’d have to configure the auth method using a more robust approach, such as using the Vault client’s JWT as the reviewer token or possibly using long-lived tokens.

Roles

Now that we have enabled and configured our auth method, we can proceed to add the other important pieces of configuration. Firstly, we need to configure the role against the authentication method. We will create a role that allows our Spring Boot application to retrieve secrets from our Vault instance.

` vault write auth/kubernetes/role/demo bound_service_account_names=’*’ bound_service_account_namespaces=dev policies=spring-boot-demo
`

The role above will be called demo, and it will be bound to any service account (for finer grained security you would usually limit this to a specific account), it will be bound to our dev namespace and will have a policy attached to it named spring-boot-demo (more on this later).

Each Kubernetes auth method can have any number of roles created against it. The purpose of these roles is to restrict each integrating service to a specific set of secrets through roles and policies. The role component of this configuration determines which service(s), bound to which service account(s) can authenticate against this method (the auth aspect). The attached policy determines which secrets that service account(s) (and consequently service(s)) can access.

Policies

Vault policies define the fine-grained, path-based access to specific secrets held within Vault itself. The policy we’re using for this example looks like this:

path "kv/spring-boot-demo" {
  capabilities = ["read"]
}

path "kv/spring-boot-demo/dev" {
  capabilities = ["read"]
}

path "auth/token/lookup-self" {
  capabilities = ["read"]
}

path "auth/token/create" {
  capabilities = ["create", "read", "update", "list"]
}

This policy will give access to secrets held at kv/spring-boot-demo and kv/spring-boot-demo/dev, it also has some default Vault policies which allows the JWT token lookup to occur during login and authentication. For secret lookups, we only need to provide read access because our service will only be trying to get specific secrets, not create or update them.

The following diagram gives a high-level view as to how Policy look-ups and authorisation occur.

Secrets

Lastly, we need to enable a secrets engine and create a secret for our application to use. For this example, we will be using the Key-Value secrets engine. The following CLI command will enable the KV engine for us, with a name of kv (this should look familiar from our policy outlined earlier).

vault secrets enable kv

Next, we can put a secret into our new kv store:

vault kv put kv/spring-boot-demo/dev admin=password

Now that we have our Vault instance configured with the kube auth method, a role, an appropriate policy and secret data, we can integrate a sample application to test it.

Spring Cloud Vault

To test our Vault configuration and close the loop with our example setup, we will use a Spring Boot microservice, which has endpoint security configured with Spring Security. For this demo, we will be using actuator which only exposes /health by default for security reasons. Let’s expose some actuator endpoints that could contain sensitive information such as /env and /heapdump, and secure them with spring security.

Securing Our Endpoints

In Spring Boot it’s fairly straightforward to enable various actuator endpoints. Spring provides a management config block, which allows developers finer-grained control over which endpoints are exposed, and also which sub-sets of information are exposed at those levels.

management:
  endpoint:
    ...
    env:
      enabled: true
    heapdump:
      enabled: true

We will be enabling the env and heapdump endpoints as mentioned above, this means we could be exposing sensitive information about our service if not secured correctly.

In order to secure the actuator routes properly we need to implement spring security. A simple pattern I like to follow is to split my routes into secure and insecure, allowing pass-through traffic for any non-secure route, and then handling secure routes with appropriate role-based access controls. Our configuration will look like as follows, including a ‘management’ style user for access purposes.

application.yml

appsecurity:
  management:
    username: ADMIN
    password: ${admin:test}

securedroutes:
  management:
    - "/actuator/shutdown"
    - "/actuator/loggers/**"
    - "/actuator/heapdump"
    - "/actuator/env"
  unprotected:
    - "/actuator/info"
    - "/actuator/prometheus"
    - "/actuator/health/**"
    - "/hello"

The config above allows us to use a placeholder value for our management user password. This is useful for unit test purposes where we don’t want to create another application.yml test resource file. If we don’t supply a value at runtime, the default value of test will be used.

In order for our application to use this configuration, simple configuration properties can be used to map the values to a configuration class:

@Getter
@Setter
@Configuration
@ConfigurationProperties(prefix = "securedroutes")
public class SecuredRoutesConfig {

  private String[] management;
  private String[] unprotected;
}

Our Spring security config will configure our application to allow any requests accessing non-secure routes to pass-through without any auth checks, whereas any requests to our secured routes will be subject to authentication and authorisation checks. An example of this config is show below.

private final SecuredRoutesConfig securedRoutesConfig;

@Value("${appsecurity.management.username}")
private String managementUsername;

@Value("${appsecurity.management.password}")
private String managementPassword;

@Bean
public PasswordEncoder encoder() {
return new BCryptPasswordEncoder();
}

@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http
    .formLogin().disable()
    .csrf().disable()
    .authorizeHttpRequests((requests) -> requests
        .requestMatchers(securedRoutesConfig.getUnprotected()).permitAll()
        .requestMatchers(securedRoutesConfig.getManagement()).hasRole(ROLE_MANAGEMENT_USER)
    )
    .httpBasic(withDefaults());

return http.build();
}

@Bean
public UserDetailsService userDetailsService() {
UserDetails user =
    User.builder()
        .username(managementUsername)
        .password(encoder().encode(managementPassword))
        .roles(ROLE_MANAGEMENT_USER)
        .build();

return new InMemoryUserDetailsManager(user);
}

Configuring Vault

In order to configure our Spring Boot service to integrate with Vault, we need two key parts; the Spring Cloud Vault library and our application configuration to integrate with Vault itself.

Adding the following library to the POM file gives us the full spring-cloud-vault implementation:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-vault-config</artifactId>
</dependency>

And the following configuration enables our application to integrate with Vault:

spring:
  config:
    import: optional:vault://
  cloud:
    vault:
      enabled: ${vault-enabled:false}
      application-name: spring-boot-demo
      connection-timeout: ${vault-connection-timeout:5000}
      read-timeout: ${vault-read-timeout:15000}
      authentication: KUBERNETES
      kv:
        backend: kv
        enabled: true
        profile-separator: '/'
        application-name: spring-boot-demo
        default-context: spring-boot-demo
        profiles: dev

Some of the configuration above might already start to make sense based on how we configured our Vault instance earlier. The main aspects to point out are the kv engine configuration;

backend: kv - this tells Spring Boot the name of the kv secrets engine to lookup in Vault
profile-separator: '/' - this tells Spring Boot the path separator used in the secrets engine, e.g. kv/
application-name: spring-boot-demo - this tells Spring Boot the naming convention of the secret lookup, e.g. kv/spring-boot-demo
profiles: dev - this refers to the active profile Spring Boot is running, as Spring Cloud Vault uses that profile to determine the secret path to use, so dev would give us kv/spring-boot-demo/dev.

Bringing It All Together

Given that we have a local Vault instance set up and a Spring Boot service to integrate with it, we can deploy our app and test the successful retrieval of secrets to secure our application.

Firstly, we will add our Spring Boot service to our Helmsman desired state file.

apps:
  spring-boot-demo:
    namespace: dev
    enabled: true
    chart: '../service-helm-chart'
    version: 1.0.0
    valuesFile: values/service/values-dev.yaml    
  vault:
    ...

Then we can apply the updated state file:

helmsman --apply -f dev.yaml

Once the new Spring Boot service is running successfully, we can test the actuator endpoint has been secured properly with our secret we set up in Vault.

> kubectl get deploy -n dev
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
vault-agent-injector     1/1     1            1           64s
spring-boot-vault-demo   1/1     1            1           65s

We can port-forward to the running pod to establish a localhost connection and conduct some basic cURL tests. When calling a secure endpoint without any authentication using cURL, we should receive a 401 response.

kubectl port-forward deploy/spring-boot-vault-demo 8080:8080 -n dev

> curl http://localhost:8080/actuator/env -v
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /actuator/env HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 401
...

Now, when we use cURL to call the same endpoint while providing the authentication secret stored in Vault, we should receive a 200 response, along with the JSON payload that outlines the environment properties stored in the service.

> curl http://localhost:8080/actuator/env --user ADMIN:password -v
*   Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
* Server auth using Basic with user 'ADMIN'
> GET /actuator/env HTTP/1.1
> Host: localhost:8080
> Authorization: Basic QURNSU46cGFzc3dvcmQ=
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
...
{"activeProfiles":["dev"],"propertySources":[{"name":"server.ports","properties":{"local.server.port":{"value":"******"}}},...

And that’s it! All working as expected.

Conclusion

Keeping sensitive information secure in production systems is paramount. With the vast number of data breaches last year, which caused chaos for those who fell victim, ensuring data security and mitigating attack vectors is critical for engineering robust, well-designed systems. While this post has outlined a simpler approach to integrating a Spring Boot microservice with a secrets management solution, it hopefully demonstrates that it doesn’t have to be incredibly complex to get it right.

You can see all the code to accompany this post over on my github.

Keeping Spring Boot Apps Secure With HashiCorp Vault was originally published by Capgemini at Capgemini Software Engineering on July 07, 2023.

Navigating Kubernetes Deployments With Helmsman

2023-05-22T00:00:00+01:00

Kubernetes is one of the most popular open-source container orchestration frameworks in use today. It allows you to easily deploy, scale and manage containerised applications. As your applications grow, the number of Kubernetes resources you have to manage increases, and that’s where Helm comes in. Helm is a package manager for Kubernetes, allowing you to define, install and manage complex Kubernetes clusters at scale. However, unless you want to install all of your helm charts individually (and possibly manually), there is a need for an automated, infrastructure-as-code approach. Enter Helmsman.

The Problem

As mentioned above, in a productionised domain, the set of deployed services and their accompanying resources will grow exponentially. Even when using a package manager like Helm, the sheer amount of deployable resources and packages can become hard to manage.

If you have ten Helm charts to deploy, you could be running ten install and/or upgrade commands to reach the desired cluster state for any given environment. Furthermore, if you have multiple environments (dev, test, preprod, prod etc), you then have ten commands per environment to run - you can quickly see how this could become difficult - not to mention inefficient - to manage.

An Introduction to Helmsman

Helmsman is a tool which allows you to define the desired state of your Kubernetes cluster in code, giving you the ability to deploy, upgrade or destroy that state in a single command. Each environment (namespace traditionally in Kubernetes) has its own state file, making managing versions and resources across environments much simpler.

As a result of Helmsman encapsulating the state of your cluster(s) in code, you can easily describe the state of any cluster by looking at the Helmsman desired state file. This makes it easier to manage what’s deployed, where and at which version.

A Helmsman Story

Let’s take an example where we have a service domain which contains four microservices. Each microservice has slightly different resources requirements (CPU/Memory) and two of them are required to integrate with a database. In non-production environments (dev, test) they are not required to be highly-available, whereas in production environments (preprod, prod) they are.

Basic Helm Chart

We’ll create a Helm application chart that can define the Kubernetes resources required for each of our services. Our example service chart will contain some standard Kubernetes resources such as a deployment and network policy.

metadata:
  environment: replace-me

deployment:	
  create: false
  replicas: 1
  name: replace-me
  image: replace me 
  ports: 
    - 8080
  resources:
    requests: 
      memory: "250Mi"
      cpu: "250m"
    limits: 
      memory: "350Mi"
      cpu: "300m"

networkPolicy:
  create: false
  podSelector:
    matchLabels:
      app: replace-me 
  policyTypes:
    - Egress
  egress: {}

The above is heavily simplified from what a real production chart may look like, but the purpose here is just to give an example to work from later.

Above you can see a create: false property on each resource, this is a practice I tend to follow when building Helm library charts, as it gives implementing charts the ability to opt-in to whichever resources they need, and not just get them implemented by default.

Microservice Setup

Each microservice will have it’s own implementation of the base chart shown above. Let’s first use microservice-a as an example, which has no extra resource requirements, and no database connectivity.

Chart.yaml

---
apiVersion: v2
name: service-a
description: Chart for microservice A
version: 0.1.0
dependencies:
  - name: base
    version: 1.0.0
    repository: "@base-repository"

values.yaml

base:
  deployment:	
    create: true
    replicas: 1
    name: service-a
    image: service-a:1.0.0

As you can see above, microservice-a has a very simple implementation of the base chart, mostly using the default values provided.

Now let’s look at microservice-b. This service will have slightly higher resource requirements and will also need egress networking out to a MySQL database (running in a pod in the same namespace).

Chart.yaml

---
apiVersion: v2
name: service-b
description: Chart for microservice B
version: 0.1.0
dependencies:
  - name: base
    version: 1.0.0
    repository: "@base-repository"

values.yaml

base:
  deployment:	
    create: true
    replicas: 1
    name: service-b
    image: service-b:1.0.0
    resources:
      requests: 
        memory: "500Mi"
        cpu: "350m"
      limits: 
        memory: "550Mi"
        cpu: "400m"

  networkPolicy:
    create: true
    podSelector:
      matchLabels:
        app: service-b
    policyTypes:
      - Egress
    egress:
    - to:
      - podSelector:
          matchLabels:
            app: mysql

Helmsman Implementation

Now let’s look at the Helmsman implementation and how it makes dealing with multi-service deployments simpler.

Our very simple Helmsman folder structure will look as follows (showing only service-a and service-b for brevity):

.
├── dev.yaml        
├── test.yaml       
|── preprod.yaml    
|── prod.yaml       
└── values
    ├── service-a
        └── values-dev.yaml       
        └── values-test.yaml       
        └── values-preprod.yaml   
        └── values-prod.yaml       
    ├── service-b
        └── values-dev.yaml        
        └── values-test.yaml       
        └── values-preprod.yaml    
        └── values-prod.yaml       

Let’s look at a desired state file and one of the values files for each service in a bit more detail to show what’s happening.

As mentioned previously, Helmsman provides a way of describing the desired state for your Kubernetes cluster. In the example we’re using, we’ve got two clusters; non-production (containing dev and test namespaces) and production (containing preprod and prod namespaces).

Let’s take a look at the dev.yaml state file;

metadata:
  description: Desired State File for the dev environment

namespaces:
  dev: 

helmRepos:
  stable: http://custom-helm-repo-example.com

apps:
  service-a:
    namespace: dev
    enabled: true
    chart: stable/service
    version: 1.0.0
    valuesFile: values/service-a/values-dev.yaml    
  service-b:
    namespace: dev
    enabled: true
    chart: stable/service
    version: 1.0.0
    valuesFile: values/service-b/values-dev.yaml     

There’s a few bits going on in the above state file definition, so let’s break it down.

The namespaces property allows you to define the namespace(s) you have or want as part of this state definition. If the namespace(s) don’t exist when you run Helmsman, it will create them for you.

namespaces:
  dev:

The helmRepos property allows you to define the Helm repositories where your packaged charts are stored. There are several options for chart repositories, such as; default, private (backed by Google, AWS or basic auth) and local.

helmRepos:
  stable: http://custom-helm-repo-example.com # This doesn't exist, it's just shown for example purposes

The apps block is the most important block within the example state file shown above, it defines all the services you want deploying as part of this state file. Helmsman is very powerful and provides a lot of configuration options for deploying apps and configuring them. In the example above, we’re using a simple app definition for each service.

apps:
  service-a: 
    namespace: dev 
    enabled: true 
    chart: stable/service 
    version: 1.0.0 
    valuesFile: values/service-a/values-dev.yaml

An important property defined above is the valuesFile property, this tells Helmsman where the values file to be installed as part of this release is located within the Helmsman structure.

As displayed previously, our Helmsman file structure contains the following files;

└── values
    ├── service-a
        └── values-dev.yaml        
    ├── service-b
        └── values-dev.yaml        

So when we’re specifying the valuesFile property as values/service-a/values-dev.yaml it’s referring to the following folder

└── values    
  ├── service-a
      └── values-dev.yaml        

Now let’s look at the contents of those files - this is where the modularisation within Helmsman really shines.

Earlier on we stated that Service A doesn’t have any additional requirements beyond the standard chart specification. Whereas Service B had the additional requirements of higher resources and a connection to a MySQL database. With that being said, let’s look at the values-dev.yaml definition for these services

Service A

Service A only needs to specify the environment it sits within and some basic information about the deployment; name, image and container port, everything else is already defined in the base service chart that we’re using (as defined in the Helmsman dev.yaml state file).

metadata:
  environment: dev

deployment:	
    create: true
    name: service-a
    image: service-a:1.0.0
    containerPort: 8080

Service B

Service B on the other hand, needs a bit more configuration to meet requirements.

metadata:
  environment: dev

deployment:	
  create: true
  name: service-b
  image: service-b:1.0.0
  containerPort: 8080
  resources:
    requests: 
      memory: "500Mi"
      cpu: "350m"
    limits: 
      memory: "550Mi"
      cpu: "400m"

networkPolicy:
  create: true
  podSelector:
    matchLabels:
      app: service-b
  policyTypes:
    - Egress
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: mysql

For the Service B values-dev.yaml file we have specified the environment, deployment and networkPolicy configuration values. This has allowed us to override and add to the values that are defined in the base service chart we’re using as part of this deployment.

As our project grows, we can easily add more services to our desired state file(s), making the management of our environments much simpler than if we had to manage all the helm charts individually.

Bringing It All Together

So now we have our example Helmsman project setup, with our desired state file(s) ready to provision services into our cluster. All we need to do now is issue certain Helmsman commands and we’ll have our services running in no time. Ideally, you’d run Helmsman from CI pipelines, but that goes beyond the scope of this post. We’ll now take a look a few of the more widely used commands.

Dry Run

A really useful feature of Helmsman is the ability to use dry-run. This allows you to point Helmsman at one of your desired state files and do a dry-run installation against your cluster. The benefit of this is you get to see the rendered Kubernetes manifests that would be installed, and can easily verify and validate that the manifests to be installed are correct, without them actually being installed.

helmsman -f dev.yml --dry-run

Apply

Next up is the apply command. This applies your desired state file to your kubernetes cluster, installing all the resources via Helm.

helmsman -f dev.yml --apply

Destroy

Another useful command is the destroy command. This tears down your cluster based on the desired state file - this is useful if you want to tear down environments quickly or nightly to save costs.

helmsman -f dev.yml --destroy

Wrapping Up

Although this post has only shown a very simple example project, hopefully you can see how Helmsman is a very useful tool for managing our Kubernetes environments. As service domains grow, so do the amount of resources we need to keep track of and implement to keep everything ticking along. Rather than trying to keep a handle on all of those resources manually, it’s better to leverage specific tooling (like Helmsman) to provide consistency, efficiency and a much better developer experience!

Helmsman is just one approach to managing your kubernetes environments, and is a good entryway to more GitOps style approaches such as FluxCD or ArgoCD (among others).

You can see all the code for an example service scenario like the one described in this post over on my github.

Navigating Kubernetes Deployments With Helmsman was originally published by Capgemini at Capgemini Software Engineering on May 22, 2023.

Preparing for Devoxx

2023-05-10T00:00:00+01:00

It’s a big deal preparing for sponsorship of a conference. Each year, the Cloud Development team at Capgemini are proud to sponsor Devoxx UK, the leading developer conference in Britain. What does this involve? A lot more creativity than you might think!

Theme

Capgemini is a huge global company with global annual goals and missions. We are a relatively small team of 100 or so UK-based software engineers, so aligning our goals with the wider company can be something of a challenge to start with! This year, Capgemini’s purpose was a good start for us: “Unleashing human energy through technology for an inclusive and sustainable future”. The Capgemini brand platform of “Get the future you want” is also good for Devoxx - part of the reason we’re sponsoring the conference is to remind people that we’re always recruiting for new talent and can offer a great place for a software engineer to work and develop themselves.

We knew we wanted to focus on sustainability as it’s one topic close to all our hearts as well as being a Capgemini goal. But we didn’t want to make people feel depressed or personally responsible - we wanted to inspire them to make a difference. I recently attended a Capgemini “Climate Fresk” workshop, and whilst it was educational, it was mainly terrifying! The concept of a “Fresk” is that it works rather like a round table discussion, there are a number of prompt cards around environmental topics such as carbon dioxide levels, deforestation, rising sea levels, weather pattern disruption, sea water acidity levels, plant and animal diversity, CFCs, forest fires, population migration. The cards could be positioned to show which events affected which other topics, and so during the hour we built up a map of the effects of fossil fuel extraction on our planet. Most of the information I had seen before but I did learn and relearn a couple of things, for example as more CO₂ is absorbed by the ocean, the pH of the sea water rises and the sea becomes more acidic. This makes it difficult for small sea creatures to form shells, because their shells are made of alkaline calcium which dissolves in more acidic waters. These tiny creatures are the base of the food chain in the sea and so depletion in their numbers has a massive effect on the population of larger sea creatures. I came out of the fresk feeling informed but scared - not an emotion we want people to associate with Capgemini! But neither do we want to shy away from the problems that burning fossil fuels are causing. Instead, we decided to focus on the positives. This year’s TechnoVision report contains some powerful and brave messages for Capgemini employees - for example, “Do more with less”. This is very often not the easy route, and for Capgemini perhaps not the most profitable route either, it takes courage to tell your clients that the best route forward is not to build any software at all! But we need to recognise that sometimes this is the right answer. Wasteful technology is something we can all do without.

Taking that as a starting point, what about useful tech? How can we make it more carbon efficient? We know that great strides have been made towards generating carbon neutral electricity, for example from renewable energy sources such as wind, waves and sun. We know that many European countries generate quantities of electricity that way when they can - but it depends on factors such as hours of sunlight, wind speed, river levels. How can we know whether the electricity our application farms are using is generated from renewable sources? Turns out there is a way. The Energy Monitor website collects data from 27 European countries (many of which host data centres for major cloud providers such as Amazon and Azure) so that you can see, for a given point in time, which country is producing the most electricity from renewables. What if you could use this information to move your applications to the data centre using the most “green” electricity? Now that’s inspiring!

A lot of what we do as Cloud developers gives us opportunities to make electricity savings by reducing the amount of compute power we use. The fantastic advantage of infrastructure-as-code is that you can safely tear down huge proportions of your infrastructure when you’re not using it - for example, only have your build pipelines running when you actually have something to build. We are contributing to a Capgemini “Green Book” of practices that we can share with our clients, to help them reduce their carbon footprints with minimal impact to their businesses.

Swag

OK so now we have a phrase to print on our stand (“Get the future you want”), and a theme. What next? We want something to give away that is useful and reusable, that isn’t plastic, that’ll remind people of meeting us. We need some cotton T-shirts! Last year we brought along 40 or so of our Capgemini / Ada Lovelace “I am a Role Model” T-shirts and they went like hot cakes. Unfortunately the million tonnes of paper notepads and sweets that we brought along didn’t go down so well - turns out devs don’t write much and are rather healthier than we’d given them credit for! So we know that T-shirts are the way to go. But what picture can we put on them, that developers attending the conference will want to wear? Recently, Capgemini opened a new “Delivery Centre” in our Holborn office. This is a step back to teams working face to face, appreciating the value of getting people together. For the centre opening, our resident artist had drawn some fantastic images of an octopus busily multi-tasking that had got a lot of attention. Octopuses have started popping up all over the delivery centre and we thought we’d get involved! So we stole an octopus for our T-shirt.

All we needed now was some text to go with it. I turned to my team to get the best octopus puns and they didn’t disappoint.

“Be INK-redibly productive”
“Don’t be a sucker! Develop at octo-speed”
“8-bit computing”
“Octo-pushing delivery forward”
“Un-LIMB-ited potential”
“Kraken’ on with development”
“This is what beak performance looks like”
“Be an octo-coder”

OK thanks guys. Enough already. We decided that the un-LIMB-ited potential slogan fitted really well with our concept of “Get the future you want”, and so our T-shirt is a wrap! Now we can start to look forward to the talks on offer at the conference and up-skilling ourselves for the year ahead. Roll on May 10th!

Preparing for Devoxx was originally published by Capgemini at Capgemini Software Engineering on May 10, 2023.

Modern Frontends live

2023-04-14T00:00:00+01:00

It wasn’t the best conference that I have been to. A lot of the speakers complained in public about the lack of organisation. The event was not live streamed as promised, some speakers pulled out, none of it recorded and although at the Excel Centre, the location in the Excel Center was at the very end and poorly sign posted. Coffee was super gross, but I didn’t drag myself away from my comfy home office for the free lunch. The content was great. It can be hard to keep up with what is going on, unless you are knee deep in the cesspool Twitter feeds on a daily basis. Attending conferences and meetups are a great place to play catch up. I am not sure what this post pandemic world holds for conferences. It’s nice to see a slew of conferences streaming live, although, I find it too easy to not attend or pay attention when you are not physically there.

Here a list of ‘uncut gems’ and people I think are well worth a follow providing Twitter doesn’t turn into a giant sink hole. Enjoy!

Things we should stop using JavaScript for

Native browser elements

Stop using JavaScript for that: move features from JS to CSS and HTML

Kilian Valkhof

Using native colour picker and use <input type="color"> instead.

I never thought I would see a ‘pop up’ become native after Google gave the idea the axe because it takes the user away from the content. The modal or dialogue lost favour from a UI perspective. Things change, so it’s always a good thing to know that this native colour picker exists. I can certainly see this native feature for browser based software. HTMLDialogElement

CSS How Does CSS Work?

Elad Shechter

Reinventing the reset.

CSS all

all: unset;

Un-sets everything except the Unicode setting. No more overriding overrides of overrides. I wish I had this when I was forced to use Bootstrap for all site development.

Without using before and after elements, using all:unset to build a custom checkbox that looks more like a mobile UI interface.

display: revert;

Rolls back the cascaded value to the user agent’s default style.

Using this as a launch point may I introduce the new CSS reset. Like Reset and Normalise, CSS reset does a similar thing with half the code. We no longer need to specifically override every property.

If you want to find out more about where it came from and how it’s put together read The new CSS Reset

The Four Principles of Accessibility

Homer Gaines, Certified #A11y Professional

Twitter accessibility communities

Refreshing perspective from across the pond. You can watch the presentation on YouTube.

Tab index should never be less than 1. Negative numbers are not reachable with keyboard navigation but can be useful should you wish to make this element visible on another action.

HTML Semantics can sometimes be a pain. Design may want to give weight to copy when semantically it does not make sense and can hurt your SEO and page rankings not mention make a mess of your accessibility and CSS overhead.

Welcome ARIA presentation. This means that these elements and all their children are not exposed to assistive technologies.

Beyond the browser – how to talk with robots

Nico Martin

It is now possible to use the browsers’ native Bluetooth API to connect with things to do stuff.

Nico used the browser to communicate via Bluetooth to his raspberry pi driven LED matrix. The feature is experimental and needs to have the flag turned on, but this is a good indicator of the direction of how the internet of things may interact with the browser, or any browser.

The Web’s Next Transition

Kent C. Dodds

If you haven’t heard of Kent C Dodds, what rock have you been living under? This next bit makes me feel like we’re going back to table design with CSS grid.

Progressively Enhanced Single Page Apps (PESPA)

This is the idea that we progressively enhance our web pages to work with JS. So where there is no JS… we show a submit button, if JS is available, use the fetch API.

Remix is a web stack that follows this model. I can’t say I am not a fan, even if JS is pretty much ubiquitous.

Find out more from Kent’s article.

Gotta Go FAST: Use Web Components with FAST

Waller Goble

I love Web Components, it’s a standard that is native to the browser, but like most new browser based technologies, is still not fully supported by browsers that are not evergreen (browsers that are automatically upgraded to future versions, rather than being updated by distribution of new versions from the manufacturer). Web Component base frameworks I am familiar with, include, Lit (Google) which is a polyfill to help bridge this gap, Salesforce Lightning, which does that same thing under the hood. If you are familiar with the Big Commerce platform, it too uses Web Components Stencil for most it’s components. FAST, is the new kid on the block and a nice surprise. It’s also a surprise that this library has been developed by Microsoft.

Find out more about Web Components on MDN and its web component libraries available.

Having been burnt by ActionScript as a Flash development, I worry that as Meta and Facebook become less popular platforms, Meta and it’s community support for React might fade and I would rather invest in tech that is browser standard… Like JS, it’s not perfect, but everyone uses it. Having said this, because web components are standard, they can also be integrated with your preferred framework.

What my browser can do

Francesco Leardini

The Page Visibility API provides events you can watch for to know when a document becomes visible or hidden, as well as features to look at the current visibility state of the page.

Instead of sniffing for unreliable devices that date, have a look at the Media Capabilities API

The WakeLock API prevents device screens from dimming or locking when an application needs to keep running.

This is done by reading luminance around the host application with the Ambient light sensor.

File system access API only available with HTTPS. I see applications for this in combination with Web Machine Learning. keeping personal information local, secure and decentralised.

Native Web share will allow users to share data with other sites without the need for horrible 3rd party APIs.

Access to contacts with the Contact Picker API.

WebCodecs API individual frames of a video stream and chunks of audio Enables hardware encoders/decoders via a WebCodec API. Codecs have always been a pain having to accommodate for different proprietary codecs only available to users of certain OS or software, so this is a nice advancement. Video codecs supported are AV1, AVC1, VP8, VP9 and HEVC.

See Francesco Leardini’s GitHub project for examples.

Beyond the web of today

Kenneth Christiansen

Web apps should be able to do anything iOS, Android, or desktop apps can. The members of the cross-company Capabilities Project want to make it possible for you to build and deliver apps on the open web that have never been possible before. See Fugu API Tracker for current and future API in development for the browser For fun, checkout VS Code for the Web.

Some of the features that are worth note in browser technology. Web Assembly and WASM – Web Assembly is a new type of code that can be run in modern web browsers — it is a low-level assembly-like language with a compact binary format that runs with near-native performance.

WebGL has been around for a while but is very much underutilised. As browsers are increasingly evergreen, using webGL is becoming more viable. WebGL (Web Graphics Library) is a JavaScript API for rendering high-performance interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins. Kinda makes me angry knowing what Flash was able to do 20 years ago!

While I have been aware of WebGL I was not aware of WebGPU, is the working name for a future web standard and JavaScript API for accelerated graphics and compute, aiming to provide “modern 3D graphics and computation capabilities”

Web Machine Learning. Arguably the most exciting thing coming to the future of the browser, or is it here now. Web Neural Network brings machine learning to the browser natively. Some interesting uses include machine translation, detecting fake video, facial recognition and emotion analysis. There are a lot of ethical considerations for this technology, and its uses, however, the there are some very good use cases for assistive technology and accessibility for disabilities.

Overall, conferences are good to find out where you would like to get your toes wet and there is no better time than now to start playing around and learning a new skill.

I think that if AI is not outlawed outright, using browser technology to enhance if not completely automate user journeys and interactions is on the cards. How will the web interact with virtual assistants to make holiday bookings and cobots (collaborative robots) to provide medication in our senior years? The future is definitely interesting. Let’s see.

Modern Frontends live was originally published by Capgemini at Capgemini Software Engineering on April 14, 2023.

Consumer Driven Contract Testing with Pact, Kafka and Spring Boot

2023-01-06T00:00:00+00:00

The Problem

When developing microservices within a distributed system there is a need to ensure that where services communicate with one another, both the providing and consuming services understand what the other expects.

A common solution to this problem is end to end integration testing, where services being tested are deployed into a production-like environment at the same time and real usage scenarios are executed. This allows a relatively high level of confidence that the system and its components work together as expected, however this method of testing has the following drawbacks:

It is a slow process - tests often don’t run in parallel, co-ordination between the teams developing the services can be long winded, teams may be lagging behind others in feature completion so testing can’t be fully representative.
Tests are fragile and are hard to debug - due to so many moving parts, test starters, env configurations, different app versions etc. tests are very brittle and difficult to debug efficiently.

Consumer Driven Contract Testing

Consumer driven contract testing is an alternative approach to end to end testing, the focus is on a single component and its integration boundaries at a time. The responsibility for defining the contract which needs to be adhered to is placed on the consumer. This approach alleviates many of the issues with end to end testing mentioned above:

Faster - services don’t need to be deployed and can run locally or in a build pipeline so feedback on breaking changes is much faster.
Simpler more reliable deployments - removes the need for complicated release coordination and dependencies between teams.
Allows you to know statically at release time which services are compatible

Pact, Spring Boot and Kafka

Pact is a popular open source consumer driven contract testing library. It is usually used in the context of testing between APIs and clients. However, pact can also be used to test asynchronous event driven systems. The steps for this are as follows:

Test the consumer and capture the contract by using a mock provided by pact. The mock checks the consumer can successfully invoke the message handler and can successfully process the event.
All the contracts are serialised and loaded into a pact broker
Pact pulls all the consumer contracts from the pact broker, then replays them against the provider. The test verifies the provider can produce the right messages for each consumer by checking that the message structure matches what is defined in the consumer contract.

Implementation Example

In the following example, we will create a simple NBA (National Basketball Association) contract themed Spring Boot, Maven, JUnit 5 application which will implement a Kafka consumer that will generate a contract. Following which we will define a producer and see how, using Pact we can ensure that the contract between the two services is upheld. We’ll start with ensuring we have the correct dependencies.

Consumer

This consumer will listen on the specified topic for events when a new NBA player signs a contract and then generate a headline that will be logged with specific contract details pulled from the Kafka message.

@Component
@RequiredArgsConstructor
public class PlayerContractListener {
    private final Logger logger = LoggerFactory.getLogger(PlayerContractListener.class);
    private final HeadlineGenerator headlineGenerator;

    @KafkaListener(id = "demo", topics = "contract-details")
    public void listen(@Payload ContractDetails details) {
        logger.info("Contract consumed from topic!");
        logger.info(headlineGenerator.generateHeadLine(details));
    }
}

The pact unit test implementation for this listener is as follows:

@ExtendWith(value = {PactConsumerTestExt.class, MockitoExtension.class})
@PactTestFor(providerName = "playerContractProducer", providerType = ProviderType.ASYNCH, pactVersion = PactSpecVersion.V3)
class PlayerContractListenerTest {

    private static final String JSON_CONTENT_TYPE = "application/json";
    private static final String KEY_CONTENT_TYPE = "contentType";

    @Mock
    private HeadlineGenerator headlineGenerator;
    @InjectMocks
    private PlayerContractListener playerContractListener;

    @Pact(consumer = "playerContractConsumer")
    MessagePact contractDetailPact(MessagePactBuilder builder) {
        PactDslJsonBody jsonBody = new PactDslJsonBody();

        jsonBody.stringType("documentType", "contract")
                .stringType("firstName", "LeBron")
                .stringType("lastName", "James")
                .stringType("team", "LA Lakers")
                .stringType("duration", "5 years")
                .stringType("salary", "158 million USD");

        return builder.expectsToReceive("A player contract")
                .withMetadata(Map.of(JSON_CONTENT_TYPE, KEY_CONTENT_TYPE))
                .withContent(jsonBody)
                .toPact();
    }

    @Test
    @PactTestFor(pactMethod = "contractDetailPact", providerType = ProviderType.ASYNCH)
    void successfullyGenerateHeadlineGivenValidMessage(List<Message> messages) {
        ContractDetails contractDetails = ContractDetails.builder()
                .documentType("contract")
                .firstName("LeBron")
                .lastName("James")
                .team("LA Lakers")
                .duration("5 years")
                .salary("158 million USD")
                .build();

        when(headlineGenerator.generateHeadLine(contractDetails)).thenReturn("A new headline");

        messages.forEach(message -> {
            assertDoesNotThrow(() -> playerContractListener.listen(
                    new ObjectMapper().readValue(message.contentsAsBytes(), ContractDetails.class)));

            verify(headlineGenerator, times(1)).generateHeadLine(contractDetails);
        });
    }
}

@ExtendWith allows us to specify both the PactConsumerTestExt and the MockitoExtension to initialise our Mockito/Pact annotations.

@PactTestFor at the class level allows us to specify the providerName, this value is important as it will need to match the name used when we build the provider tests. providerType indicates that this is a test for an asynchronous system and the pactVersion allows us to declare the Pact version (V3 in this case).

@Pact is where we specify the name of our consumer, again it is important to ensure this matches the name that is given to the provider side of the Pact test. In the pact method itself we are able to use the PactDslJsonBody to define the structure of our contract.

@PactTestFor on our test method is where we tell Pact that the contractDetailPact method will provide the messages we want to test against our consumer method to ensure that it is able to process the message structure as expected. In this example we are using a default ByteArrayDeserializer from the Apache Kafka library for message deserialisation. For brevity we are using an object mapper to mimic the deserialisation of the message from bytes. If you are using a custom deserialiser you can use that code to deserialise the message to ensure that your deserialiser can also handle the structure of the Pact message defined.

Once the test is run and the message was successfully processed by our consumer, a Pact contract is generated and stored in our target/pacts directory by default. Once this is complete we can use the following maven command mvn pact:publish to publish our contract to our Pact broker, where it will be verified against our producer to ensure that the messages it produces are what we expect.

{
  "consumer": {
    "name": "playerContractConsumer"
  },
  "messages": [
    {
      "contents": {
        "documentType": "contract",
        "duration": "5 years",
        "firstName": "LeBron",
        "lastName": "James",
        "salary": "158 million USD",
        "team": "LA Lakers"
      },
      "description": "A player contract",
      "matchingRules": {
        "body": {
          "$.documentType": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          },
          "$.duration": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          },
          "$.firstName": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          },
          "$.lastName": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          },
          "$.salary": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          },
          "$.team": {
            "combine": "AND",
            "matchers": [
              {
                "match": "type"
              }
            ]
          }
        }
      },
      "metaData": {
        "contentType": "application/json"
      }
    }
  ],
  "metadata": {
    "pact-jvm": {
      "version": "4.3.13"
    },
    "pactSpecification": {
      "version": "3.0.0"
    }
  },
  "provider": {
    "name": "playerContractProducer"
  }
}

Provider

The producer contains a simple rest endpoint that takes a new player contract as a request body.

@RestController
public class PlayerContractController {

    @Autowired
    private PlayerContractProducer playerContractProducer;

    @PostMapping("/sign")
    public void createDraftContract(@RequestBody PlayerContract contract) {
        playerContractProducer.send(contract);
    }
}

The playerContract object is mapped into a ContractDetails object and sent to the specified topic using a default Kafka template.

@Component
@RequiredArgsConstructor
public class PlayerContractProducer {
    private final KafkaTemplate<String, ContractDetails> template;
    private final PlayerContractMapper contractMapper;
    private final Logger logger = LoggerFactory.getLogger(PlayerContractProducer.class);

    public void sendContractDetails(PlayerContract playerContract) {
        template.send("contract-details", contractMapper.mapContractDetails(playerContract));
        logger.info("Contract produced to topic!");
    }
}

The test for the producer is implemented as below:

@Provider("playerContractProducer")
@Consumer("playerContractConsumer")
@PactBroker(url = "http://localhost:9292")
class PlayerContractMapperTest {

    private static final String JSON_CONTENT_TYPE = "application/json";
    private static final String KEY_CONTENT_TYPE = "contentType";
    private final PlayerContractMapper contractMapper = new PlayerContractMapper();

    @BeforeEach
    void before(PactVerificationContext context) {
        context.setTarget(new MessageTestTarget());
    }

    @TestTemplate
    @ExtendWith(PactVerificationInvocationContextProvider.class)
    void pactVerificationTestTemplate(PactVerificationContext context) {
        context.verifyInteraction();
    }

    @PactVerifyProvider("A player contract")
    MessageAndMetadata verifyMessage() {

        PlayerContract playerContract = PlayerContract.builder()
                .age(37)
                .dateSigned(LocalDate.of(2022, 4, 3))
                .documentType("contract")
                .firstName("LeBron")
                .lastName("James")
                .team("LA Lakers")
                .position("Power Forward")
                .duration("5 years")
                .salary("158 million USD")
                .build();

        JsonSerializer<ContractDetails> serializer = new JsonSerializer<>();

        return new MessageAndMetadata(serializer.serialize("kafka-pact", contractMapper.mapContractDetails(playerContract)),
                Map.of(KEY_CONTENT_TYPE, JSON_CONTENT_TYPE));
    }
}

@Provider lets us tell Pact the name of our provider which should match whatever was specified in the consumer test.

@Consumer lets us tell Pact the name of the specified consumer we are testing against, again it must match what we specified in the consumer test.

@PactBroker is where we specify the url of our Pact broker where our consumer contract is stored.

Since it is actually the responsibility of the ContractMapper within our project to ensure that the message is in the correct format, that is the class that we will unit test using Pact. The result of the mapContractDetails method call is then serialised and verified against the contract that was generated and published to the broker by the consumer.

In this instance it seems like we have missed off two important fields that are needed by our consumer - salary and team:

@Getter
@Builder
public class ContractDetails {
    private String documentType;
    private String firstName;
    private String lastName;
    private String duration;
}

As a result the test run failed with the following error:

1) A player contract: generates a message which has a matching body

    1.1) body: $ Actual map is missing the following keys: salary, team

        {
          "documentType": "contract",
          "duration": "5 years",
          "firstName": "LeBron",
        -  "lastName": "James",
        -  "salary": "158 million USD",
        -  "team": "LA Lakers"
        +  "lastName": "James"
        }

The result of the test failure has now been published to the pact broker:

Once we update our code with the missing fields and retest:

@Getter
@Builder
public class ContractDetails {
    private String documentType;
    private String firstName;
    private String lastName;
    private String team;
    private String duration;
    private String salary;
}

We now have a passing build, giving us confidence to push our producer code knowing that it does not contain any breaking changes for our consumer:

All this was done locally without having to deploy both our consumer and producer into an environment, saving us from a lot of wasted time and effort.

Consumer Driven Contract Testing with Pact, Kafka and Spring Boot was originally published by Capgemini at Capgemini Software Engineering on January 06, 2023.

Platforms to support Product Oriented Teams

2022-12-02T00:00:00+00:00

Gartner told us that by 2022, 80% of us would have moved to a more product-centric IT operating model. What does this mean, and more specifically, what does it mean for us as software engineering consultants?

Products - Delivering Value

First, some definitions. What do we mean by “Product”? For me, this is a very business-oriented term. Say you’re a dairy farm, your products might be milk, cheese, ice-cream. This maps to the Agile definition of “Product” as a vehicle to deliver value; as in “Product Owner” - perhaps the cheesemaker in the dairy - the person who understands how applications impact their business and is ultimately responsible for deciding which software changes are built and released by a team. And the important concepts behind product-oriented teams, for us software engineers, are twofold.

One, there is the concept of funding. For a business to fund a product rather than a project implies no end date to the funding stream, which fits in much better with agile practices, for example we only plan in detail for the next sprint, and it supports the creation of long-lived teams which can become much more efficient at delivery.

Second, there is the concept of having full focus on delivering business value. This isn’t new, and is probably the goal for all sprint teams, but anyone who’s worked in an agile team will have hit that sprint where they have to deliver tech debt - perhaps they have to patch some libraries, or restructure their data schemas to improve application performance, or do something in the DevOps space such as implement blue/green releases. The product owner is on board with the need to do these things, but isn’t really that interested in the implementation and really just wants these problems to “go away” so they can get back to more important sprint stories. Hence the emphasis on product teams focusing on value delivery and customer satisfaction.

Unicorns

You’ve probably spotted the flaw in this idea - how can the IT teams focus solely on “value delivery” and STILL provide something performant and secure? How does all the other stuff get done? What about creating delivery pipelines and test frameworks and patching strategies and scaling mechanisms? What about all the DevOps tasks such as making our app supportable and observable? This HAS to be done, but it isn’t accounted for in the job definition of product-centric teams.

Zones of Repeatability

The good news about all the important DevOps style work which our product-centric teams don’t have time to do is that whilst it’s heavyweight work, most of it is highly repeatable. This means that solutions can be shared between product teams.

Platforms - the new Platforms!

In other words, what we need is a “common platform”. This isn’t a new idea, and exists in some form or other in most companies - complete with a Platform Team in charge of creating and maintaining it. These platforms, however, are often destined for failure - and by failure, I mean they do not get used and do not provide the advantages that they promised, for a number of common reasons that we would want to avoid when defining our platform for supporting product-centric teams. I’ll list out some of the major pitfalls here

1. The Gap

You might be familiar with the idea of “Goldilocks problems” - in the story of Goldilocks and the three bears, she always tries the two extremes before she gets it right. The first porridge is far too hot, the second is far too cold. Similar things tend to happen to platform teams when they are given the remit to create a common developer platform - on the one hand, they create a platform which doesn’t do enough for the developer and as such, isn’t an accelerator and doesn’t save them any time. An example of this might be when a company first decides to build a shared platform and hires some platform engineers with a very loose remit. The engineers think of some tools the devs might like, and provision them somewhere, and create some roles and access rights, but without an understanding of what the developers do (or want to do) this isn’t going to be usable.

2. Should vs Could

Fellow blogger Chris Burns came up with this title. The concept comes from the speech by Dr. Ian Malcolm in Jurassic Park - “your scientists were so preoccupied with whether or not they could, that they didn’t stop to think if they should”. This might be what happens when you throw money and resources at a platform problem - you hire the best DevOps engineers, they have incredible skills with tools like Jenkins - they can script their way around a whole bunch of issues and failures, but again, they’re not focussed on the right problem. Their remit is still too loose and they haven’t got an eye on who their customer is, and because customer satisfaction isn’t set as top of their priorities, they’ll create a monster that is frankly unusable by the dev team and quite possibly also a security hazard. We’ve seen examples of this where the developer requests a job to tear down an environment, for example. Perhaps they have a bad test which relies on certain data in the database - the solution to this is to rewrite the test. The platform team’s solution might be to create some heinous post-cleanup script to reintroduce the required data before the test runs. Yes you can do that - but it’s not the right answer!

3. Devs in Chains

This pitfall is at the reverse end of the “Goldilocks Problem” - rather than having a platform that is too loosely defined, you create one that is too locked down. It might be very efficient at what it does, but if it doesn’t do what the developers need it’s still a failure because they won’t use it. For example, perhaps you provide build pipelines without the ability to edit what they do. They may deploy containers very efficiently; but what if you suddenly want to add a serverless function to your architecture? What if you want to run a different set of tests? The platform should not dictate what the dev teams can do to this level. Such restrictive platforms could also result in security issues. Developers are an extremely creative bunch, if you try to lock down access to the databases, for example, some bright spark might realise that if they simply add an ‘apt get install postgres’ into the Dockerfile of the Java applications deployed into their Kubernetes cluster, they could then ssh into the pod when it was deployed and use the PostgreSQL client installed into the container as a “back door” to access the database. Argh!

A New Mindset

So what’s the solution? We as developers know it very well. When we’re building applications for our product owners, we work in an agile way - we create friendly “user stories” and break them down into tasks that can be delivered quickly, we build a minimum viable product and get it out for some user feedback and then we iteratively improve from there. So why on earth don’t we do that when we build platforms?! Why do we hire a bunch of highly talented platform engineers and hide them away behind a service desk interface, creating the “Dev/Ops gap” we have been trying so hard to break down all these years? All that we need is for the platform team to reverse its mindset. To remember that the developer is the customer in this scenario, and in the same way as IT customers the whole world over, they do not really know what they want.

Needs versus Requirements

If a platform engineer adopts this customer-facing mindset and sits down with a developer to list out what they want, the developer might say “I want access to the production servers and all the databases”. Our engineer needs to be ready here to translate this into what they actually need, which is the ability to deploy applications, to observe application behaviour, and to make changes when necessary. This isn’t quite how the dev has phrased it! So we need to bring to the table all the skills from agile methodologies and also from practices such as domain-driven design to make sure we are getting our customer requirements right.

Build an MVP

When you look at the CNCF Cloud Native landscape it’s easy to become overwhelmed as to how you are going to build a platform which covers off all of these boxes. Of course, the way to break down this complexity is to start with a Minimum Viable Platform the same way we would build an MVP for a complex application. Figure out what you need first, figure out where the risk is, get those bits in place and working and iteratively improve from there.

The Paved Road

The secret to not being too restrictive is to follow Netflix’s example of building a “paved road” across the CNCF landscape without locking dev teams down to a certain path. For example, we know we will need some kind of pipeline automation software to run builds and other deployment-related jobs. But which one to use? It doesn’t really matter - just make sure that your platform is built in a sufficiently layered, pluggable way and then put in a suggested tool - say, Tekton chains - and if there is one team who REALLY want to use Concourse for some reason, they are welcome to configure the platform and change the pipeline tool. The “paved road” is there as an accelerator for teams who don’t have an opinion on which pipeline to use, but they are not forced to use this route.

Our Opinionated Stack

The Cloud Development team at Capgemini have created our own “Paved Road” through the CNCF landscape. We’ve called it CREATE - the Cloud Ready Environment for Application Test and Execution. It takes the principles of zero trust; customer centricity; automation first; separation of concerns. It will be open-sourced and uses mainly open source components. We’ve assumed that the cloud platform will be Kubernetes-as-a-service, as this is a good abstraction from specific cloud vendors whilst allowing scale-to-zero for maximum compute efficiency. We have defined the necessary pods for both tooling and deployment using Terraform and Helm. We’ve separated out Continuous Integration from Continuous Delivery, with separate pipelines and separate permissions for each workflow, and we add Harbor as a place where built artifacts can be interrogated, signed and stored securely.

We’ve used the wonderful Backstage portal from Spotify to build our developer interface - it’s so much more friendly and intuitive than trying a “square peg/ round hole” approach of building a portal out of service desk or issue tracking software.

The remit of CREATE is to provide a catalogue of template applications - a React app, for example, and a Spring Boot app - from which the developer can choose what best fits their need. It will then manage the whole GitOps workflow - creating in the cloud the infrastructure for running the build (pipelines, quality analysis tools etc) and also the infrastructure for hosting the application. It provides a template to deploy useful peripherals such as a credentials vault, authentication, monitoring/logging/tracing tools. We don’t mind which ones - we provide a default, but they’re pluggable; we’re more attached to the principles than the tooling choice itself. CREATE will then set up a repository for the template application, build and deploy it, and leave the build pipeline waiting for changes. It will automatically auto-scale based on load. And all this in under an hour. Yes it has its limitations - it’s designed for containerised applications hosted on Kubernetes - but anyone who’s worked on applications in this space will appreciate just how much time and effort it will save.

If you’d like to find out more about CREATE, and receive notification of when we release our MVP, please get in touch with myself or Chris Burns via Linkedin.

Platforms to support Product Oriented Teams was originally published by Capgemini at Capgemini Software Engineering on December 02, 2022.

Reference panel in Dynamics 365

2022-07-15T00:00:00+01:00

Introduction

The reference panel is a valuable feature in Dynamics 365. Subgrids generally occupy a good amount of space in a form. When we need to add multiple subgrids in the same form, it not only impacts the performance, the UI also doesn’t look good for the users. The reference panel overcomes this issue.

What is the reference panel?

As per MSDN, a reference panel is a single column section. You can insert subgrids, quick view control, or a knowledge base search control inside a reference panel section. Each control that you added in the reference panel appears as a vertical tab within the panel at runtime. You can drag and drop the various controls within the reference panel section. The default tab at the runtime is the first control added to the reference panel. The other tabs appear in the order in which they are added in the form editor. To delete a tab, use the delete key on your keyboard.

When you insert a reference panel, by default, it’s added as the last section in the tab. You can add only one reference panel per form.

Steps to create

As a system administrator, add your entity (account in this scenario) to a solution.
Open the entity form on which you wanted to insert the reference panel.
Go to Insert -> Section -> Reference Panel
Once you click on the reference panel, a section will be created as the last section in the tab.
Then insert the multiple subgrids as per your requirement in this section.
Then save and publish the form, you will see a subgrid with multiple buttons attached to the subgrid as shown below. By clicking on each button, the subgrid will dynamically change the views.

Hope you learn something new today.

Happy Learning.

Reference panel in Dynamics 365 was originally published by Capgemini at Capgemini Software Engineering on July 15, 2022.

Contributing to the DCX React Library

2022-05-27T00:00:00+01:00

Introduction

At Capgemini within the DCX (Digital Customer Experience) team we have built and released the first style-agnostic React component library, which provides consumers with a suite of tested React components that can be re-used within any React front end thereby speeding up the process of beginning a new project.

In a previous blog post we introduced the DCX React component library and in this blog post I will be explaining the process by which we created the library and how you can contribute to the ever-growing list of React components.

At the time of writing this post, a 0.4 version of the library was released to the public npm registry where the full suite of currently available React components can be viewed within the storybook.

Assumption

As I will be explaining the process by which we created a React component library, it is assumed that you are familiar with JavaScript, React and TypeScript.

Stage 1: Creating the library

Based on the growing popularity of yarn over the last 5 years and the performance benefits over the npm package manager we decided to use this as our package manager.

During our initial research for tools to help build the component library we noticed that the React ecosystem for building web applications was quite saturated with tools like Create React App (CRA), Next.js and Remix but the options for building a React component library were limited.

To create the library, we opted to use microbundle, which is a “zero-configuration bundler for tiny modules”.

The microbundle is a wrapper around rollup with predefined defaults such as minification and compression that produces nicely formatted stats, multiple target formats, ES modules, CommonJS and UMD.

The bundle sizes are small because we made a conscious effort to avoid importing external libraries, thereby minimising our dependency on external code, however in some cases to avoid reinventing the wheel we selectively imported libraries like lodash with a small number of external dependencies to reduce our exposure to vulnerabilities.

As you can see below, the library’s bundles are small and available in multiple targets:

kB: dcx-react-library.js.gz
9 kB: dcx-react-library.js.br
34 kB: dcx-react-library.modern.js.gz
36 kB: dcx-react-library.modern.js.br
92 kB: dcx-react-library.module.js.gz
86 kB: dcx-react-library.module.js.br
1 kB: dcx-react-library.umd.js.gz
97 kB: dcx-react-library.umd.js.br

More importantly because of the benefits of type scripting we were mostly interested in the “out of the box” support it has for TypeScript.

To configure the microbundle we added the following properties within the project’s package.json file to specify where the input files are to come from, the location where the output bundles are to be placed and the directory from where the TypeScript types are placed.

{
  "source": "src/index.ts",
  "main": "dist/dcx-react-library.js",
  "module": "dist/dcx-react-library.module.js",
  "unpkg": "dist/dcx-react-library.umd.js",
  "typings": "dist/index.d.ts",
  "files": [
    "dist"
  ]
}

Stage 2: Setting up the library

To ensure that we build the library to a high standard there were several tools and processes we put in place to aid our efforts.

Firstly, we added a .gitignore file to exclude the generated file from our remote repository.

  *.log
  .DS_Store
  node_modules
  .cache
  .idea
  dist
  coverage
  .parcel-cache
  example/.parcel-cache/*
  example/build
  storybook-static

Secondly, we added both an .eslintrc.json and an .eslintignore file to enforce a number of rules to guide the standard of written code and to specify which directories should not be linted.

Thirdly, we added Jest, the testing framework maintained by Meta (formerly Facebook), to the project to aid our test-driven development.

As a precedence, to ensure the reliability of the application code we specified that all branches, functions, lines, and statements should be at 100% within the library. As it stands, we currently have 100% test coverage on the application code.

Below is the current configuration specified in the project’s jest.config.ts file:

module.exports = {
  preset: 'ts-jest',
  testEnvironment: 'jsdom',
  collectCoverage: true,
  coverageReporters: ['json', 'lcov', 'text', 'html'],
  coverageThreshold: {
    'global': {
      'branches': 100,
      'functions': 100,
      'lines': 100,
      'statements': 100
    }
  }
}

Our components have been unit tested using the React Test Library created by Kent C. Dodds

To standardise the format of the commit messages we turned to commitlint, a linter for commit messages.

Commitlint ensures that messages MUST be prefixed with one of the following depending on the content of the commit.

[build, chore, ci, docs, feat, fix, perf, refactor, revert, style, test] [type-enum]

git commit -m "build: {{ name of build config change }}"
git commit -m "feat: {{ name of feature }}"
git commit -m "fix: {{ name of bug }}"
git commit -m "test: {{ name of test }}"

Finally, as well as the above we added two web hooks to the project to run when contributors commit new code and push code upstream to the remote repository using husky

pre-commit: used to lint the content within the commit, if any lint errors are found then the commit will fail
pre-push: used to start a full jest test run and check for 100% test coverage, if any tests are broken or test coverage is below the set 100% configuration then the push will fail.

Stage 3: CI/CD Set up for the library

To ensure consistency in the application code on the remote branch we decided on using circleci within the git workflow to automate the continuous integration. This tool allowed us to maintain the integrity of the combined content within the main and release branches, which integrates directly with the host git repository.

The automated builds were configured with the following config.yml:

version: 2
jobs:
  build:
    docker:
      - image: circleci/node:12.22.0
    working_directory: ~/repo
    steps:
      - checkout
      # Download and cache dependencies
      - restore_cache:
          keys:
            - v1-dependencies-{{ checksum "package.json" }}
            # fallback to using the latest cache if no exact match is found
            - v1-dependencies-
      - run: yarn install
      - save_cache:
          paths:
            - node_modules
          key: v1-dependencies-{{ checksum "package.json" }}
      # run lint
      - run: yarn lint
      # run tests!
      - run: yarn test --runInBand --logHeapUsage && ./node_modules/.bin/codecov
      # deploy storybook
      - run: yarn chromatic --project-token=c6317a751fef --auto-accept-changes

Stage 4: Storybook Documentation

As with most component libraries, we created a Capgemini themed storybook which showcases all the implemented components. It includes a detailed description of each component, including a list of required and optional properties, example styled components and a live preview which gives consumers the ability to play with all the components by editing default props.

Documentation for all components are held in .mdx files within the \stories directory.

We have also used a series of addons to build the storybook.

module.exports = {
  stories: ['../stories/**/*.stories.@(mdx)'],
  addons: [
      '@storybook/addon-links', 
      '@storybook/addon-essentials', 
      '@storybook/addon-docs', 
      '@storybook/addon-controls', 
      'storybook-css-modules-preset',
      '@storybook/addon-a11y'
    ],
};

Stage 5: Contributing to the React DCX library

Introduction

The following is an abstract directory tree of the DCX React Library repository with a single component named ComponentName

dcx-react-library
├── example/
│   ├── src/
│   │    ├── components/ # add example usage of component
│   │    │   └── ComponentNameDemo.tsx
│   │    └── index.tsx
├── src/
│   ├── componentName/ # the actual component itself
│   │    ├── __test__/
│   │    │   └── ComponentName.test.tsx
│   │    ├── ComponentName.tsx 
│   │    └── index.ts
│   └── index.ts
├── static/
├── stories/ # the story demo for component
│   ├── ComponentName/
│   │    ├── Documentation.stories.mdx
│   │    ├── Live.stories.mdx
│   │    ├── Styled.stories.mdx
│   │    └── Unstyled.stories.mdx
│   ├── liveEdit
│   │    └── ComponentNameLive.tsx # the editable render of the component
│   ├── Introduction.stories.mdx
│   └── style.css # styles used within the stories of the components
├── .eslintignore
├── .eslintrc.json
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── jest.config.ts
├── LICENSE
├── netlify.toml
├── package.json
├── README.md
├── setup.sh
├── tsconfig.json
└── yarn.lock

Now the fun part, contributing to the library. This can be done in a few ways:

Adding a new component
Updating documentation
Enhancing an existing component
Resolving bugs
Improving accessibility

Adding a new component

The first step would be to decide what kind of component you would like to add, what functionalities the component should support based on consumer needs and how the component can broaden the breadth of the library.

After deciding on the above the first thing to do will be to add a directory for your component within the src directory.

Add a .tsx file for your component with a file name matching the name of your component e.g. ComponentName.tsx

Within your ComponentName.tsx add an initial export, for example

type ComponentNameProps = {
  /**
   * a property for Component Name
   */
  componentProperty: any;
}

export const ComponentName = ({ componentProperty }: ComponentNameProps) => {
   // implementation will go here

  return (
    // render component code will go here
  );
};

To export the component for use, the newly added component will need to be added to the component’s index.ts file, for example

export { ComponentName } from './ComponentName';

then within the src/index.ts file the full list of exports within the componentName directory can be exported by adding the following.

export * from './componentName';

Once complete with the above you can start the implementation by adding unit tests for the component within a ComponentName.test.tsx test file in the src/ComponentName/__test__/ directory.

Using the React Testing Library you will now be in a position to write a test, for example

import React from 'react';
import { render, screen } from '@testing-library/react';
import '@testing-library/jest-dom';
import { ComponentName } from '../ComponentName';

describe('ComponentName', () => {
  it('should render', () => {
    render(<ComponentName componentProperty="some-property" />);

    expect(screen..getByText('some-property')).toBeInTheDocument();
  });
});

Once you’ve added a feature to your ComponentName you’ll be able to add the component to a ComponentNameDemo.tsx file that can be added to the example/src/components/ directory, as follows:

import React from 'react';
import { ComponentName } from '@capgeminiuk/dcx-react-library';

export const ComponentNameDemo = () => {
  return (
    <>
      <h1>Demo of ComponentName</h1>
      <ComponentName
        componentProperty="some-property"
      />
    </>
  )
}

The ComponentNameDemo will then need to be added to the example/src/index.tsx file for it to be present within the example app front end, for example:

import { ComponentNameDemo } from './components';

const App = () => (
  <div>
    <BrowserRouter>
      <Switch>
        <Route path="/componentName" exact component={ComponentNameDemo} />
      </Switch>
    </BrowserRouter>
  </div>
);

ReactDOM.render(<App />, document.getElementById('root'));

In a terminal run the following to link the changes within the DCX React Library to the example folder

yarn

then

cd example

finally

../setup.sh

this will also open http://localhost:3000 in your default browser

Updating storybook documentation

Now that we’ve added our ComponentName we will now need to create a set of stories for our component.

There are four .mdx story files that we have for each component, which are:

Documentation.stories.mdx

A file to add the general description of the component along with a full list of the props the component has.

import { Meta, Story, Canvas, Props } from '@storybook/addon-docs/blocks';
import { ComponentName } from '../../src/componentName/ComponentName';

<Meta
  title="DCXLibrary/ComponentName/documentation"
  component={ComponentName}
  parameters={{
    viewMode: 'docs',
    previewTabs: {
      canvas: { hidden: true },
    },
  }}
/>

// ComponentName can be added here

// Here is where a general description of the component can be added

// Usage example added here

<ComponentName 
  componentProperty="some-property" 
/>

<Props of={ComponentName} />

Live.stories.mdx

A file to add a live edit of the newly created component, which offers consumers of the library a place to edit the component to observe how it renders with a specific set of props, for example:

import { Meta, Story, Canvas, Props } from '@storybook/addon-docs/blocks';
import { ComponentName } from '../../src/componentName/ComponentName';
import ComponentNameLive from '../liveEdit/ComponentNameLive';

<Meta
  title="DCXLibrary/Form/ComponentName/live"
  component={ComponentName}
  parameters={{
    viewMode: 'docs',
    previewTabs: {
      canvas: { hidden: true },
    },
  }}
/>

// ComponentName

// In the live editor you can play with all the available properties
// change the look and feel and interact with the component

<Canvas>
  <Story name="live">
    <ComponentNameLive />
  </Story>
</Canvas>

// Properties

// below are described the list of all available properties.
// the one marked with (\*) are mandatory the other instead are optional.

<Props of={ComponentName} />

Before creating the above, you will need to ensure that you have created the liveEdit/ComponentNameLive.tsx module, an example of this is:

import React from 'react';
import { LiveProvider, LiveEditor, LiveError, LivePreview } from 'react-live';
import { ComponentName } from '../../src/componentName/ComponentName';

const ComponentNameDemo = `
function ComponentNameDemo() {

  return (
    <ComponentNameDemo 
      componentProperty="some-property" 
    />
  )
}
`.trim();

const ComponentNameLive = () => {
  const scope = { ComponentName };
  return (
    <LiveProvider code={ComponentNameDemo} scope={scope}>
      <div className="container">
        <LiveEditor className="liveEditor" aria-label="editor" />
        <LivePreview className="livePreview" aria-label="preview" />
      </div>
      <LiveError className="liveError" aria-label="error" />
    </LiveProvider>
  );
};

export default ComponentNameLive;

Styled.stories.mdx

A file where styled stories of ComponentName can be added, all styles can be added to the stories/style.css file.

For scoped styles, a style.css file can be created within the ComponentName stories directory, with specific styles added here i.e., stories/ComponentName/styles.css

Unstyled.stories.mdx

A file where un-styled stories of ComponentName can be added, these stories will contain basic usage of the component with no styles applied.

Enhancing an existing component

Within the project’s GitHub project board we have a number of enhancements that we want to implement for the upcoming 0.5 release with details of the desired changes, below is a snippet of a previous enhancement made to the FormSelect component in the 0.4 release.

Currently when you want to pass the options you need to specify the value and the label.

  options={[{
    label: 'Recently published',
    value: 'published'
  }]}

This is perfect in cases where it is not necessary to have a different label from the value.

 
  options={['a', 'b', 'c']}
 

When making such changes it is important to ensure we do not remove or break any pre-existing functionality, but we need to add extra properties to support desired enhancements.

Resolving bugs

As the number of consumers of the library increases, we may find bugs within the implemented components, we encourage consumers to raise bugs on the project’s GitHub list of bugs page.

Participating in this way offers the opportunity to investigate issues to find a solution, which often is the best way to learn a new codebase.

Improving accessibility

All components are tested for accessibility and as we grow the list of components, we aim to ensure that all components meet WCAG 2.0 accessibility standards.

If any accessibility bugs are found, we encourage consumers to raise issues on the project’s GitHub list of accessibility issues.

Thinking of contributing?

If you would like to know more about the library, feel free to contact Daniele Zurico or Isaac Babalola.
If you are interested in using the library, it is now publicly available on npm.
If you would like to contribute, you can do so by forking the public repository.
If you would like to familiarise yourself with all of the built components, please take a look at the storybook documentation.

Contributing to the DCX React Library was originally published by Capgemini at Capgemini Software Engineering on May 27, 2022.

The Efficient Cloud Era

2022-05-20T00:00:00+01:00

The main theme at Devoxx UK this year was all about getting Java to be fast and lean in the cloud.

From improving startup time to allow serverless Java apps, to enabling scale-to-zero, to ensuring your application is running efficiently, many of the talks at the Devoxx UK 2022 conference were really focussing on a net outcome of reducing cost, which in turn will reduce power requirements for IT estates and aid sustainability.

Back to the Buzz

After a nervous 2-day hybrid conference last year, Devoxx was back to its buzzing self this year with speakers delighted to be back in front of audiences and the sponsors’ booths once again bursting with free gifts and chat. Capgemini were gold sponsors this year, with engineers Kevin Rudland and Chris Burns giving their talk “How to get Hacker Kids to Max Out your AWS account in 10 hours, and other reasons to focus on your Secure Software Supply Chain” (more on that later).

Our T shirt and mug freebies flew off the shelf, and our vegan, palm-oil free pick ‘n’ mix was popular in the post-lunch lull. Fantastic Capgemini AIE artist Jack Ambrose was once again on hand to help people visualise “Getting the future they want”.

Efficiency at the Fore

There isn’t time to go to all the talks at Devoxx so this article is skewed by my choices, however there were many talks around similar themes: improving the efficiency and speed of Java applications, deploying to the cloud with Kubernetes and Knative being prevalent. I’ve listed here some strong themes and great facts from the talks that I attended over the three days.

Kubernetes by Default

Everyone but everyone was talking about deploying with Kubernetes. This may have been directly or using KNative. I attended “Fantastic Java apps and how to kubefy them with Dekorate”, a live coding demo showing how the Dekorate annotations could be used to generate your Kubernetes manifest files, allowing Java devs to reduce the number of languages and syntaxes they need to get their heads around to create a Kubernetes runtime. Our Capgemini talk also suggests Kubernetes-as-a-service as the best abstraction layer between your own deployment artifacts (ie containers) and your cloud provider platform. There were talks specifically focussing on improving the sustainability of Kubernetes clusters using schedulers - for example “Sustainability in software engineering - today and tomorrow”. In this talk, speaker Martin Lippert refers to a 2019 report suggesting a quarter of data centre servers are “zombie servers” - running and using electricity but hosting no active applications. Unfortunately it seems the same applies to virtual mchines; suggesting that if we really want to reduce our power footprint for our estates, we NEED an auto-scaling platform such as Kubernetes, and we probably need a hyperscaler capable of managing the underlying machines when not in use.

Speed up your Start Time

There has been a real buzz around GraalVM in recent years at Devoxx and the wider Java industry, looking at how it can improve the startup time of your Java application. This year the speakers drilled even deeper into how to speed up an app’s start time, without losing its efficiency. My favourite talk in this area, the catchily-named “Java on CRaC” (CRaC = Co-Ordinated Restore at Checkpoint, I rather suspect the initials were chosen first…) looked into how we could start applications with the speed of a native image without losing the efficiency savings which come from running the Just-in-Time (JIT) compiler. In summary, Java bytecode runs on a JVM - Java virtual machine - which, at startup of the application, compiles frequently-executed code to native machine code. This takes a while, and to speed things up it’s possible with GraalVM to use Ahead-of-Time (AOT) compilation and run this slow process of creating a native image BEFORE the application starts. A great use-case for this is serverless functions - to be as efficient as possible with our compute time, we’d like a serverless function to scale down to zero instances in production until it’s called, then spin up in a timely manner and execute our call when we want it. There is, of course, a downside to AOT. Creating your native image before startup means the application can’t be as effectively profiled to identify “hot-spots” so overall performance is typically lower. According to the talk, applications started from an AOT image are about 0.6 times the speed of a JIT-compiled application; although you can raise this to about 0.8 times the speed with some extra performance evaluation during compilation.

The answer suggested by this talk was to start an application with the JIT compiler, but then “freeze” it once it was running and save that frozen state. Future starts of the application could use the frozen state kind of like an AOT image, meaning you get all the benefits of JIT compilation and also instant start-up. The statistics shown in the talk were impressive to say the least. Java apps on CRaC start up 2 orders of magnitude faster.

Shrinking your Apps

Quarkus was even more omnipresent at Devoxx - with the “Quarkus World Tour” on the RedHat stand and numerous talks - including “Integrating systems in the age of Quarkus, serverless and Kafka”, “Migrating a Spring Boot app to Quarkus, challenge accepted”. The message is clear, the modern focus is on improving the speed and footprint of your application; being in the cloud is a given, the next stage is being the best cloud app that you can. There is a warning though about blindly focussing on your applications. I attended a very interesting talk on how to tune your Java virtual machine (JVM) to better support a container runtime - have you ever thought about how Kubernetes allocates CPUs to the container where your JVM is running? Kubernetes allocates a time slice of a physical CPU to the container, and the JVM translates these “minicores” into processes. Assumptions about the best garbage collection model to use (serial or parallel? C1 or C2?) are made by the JVM based on its perception of how many CPUs it has available - and when this is abstracted by a container and Kubernetes, the JVM often gets this wrong. The consequences of this are, for example, serial garbage collection freezing your entire application instead of utilising the multiple processors you may be paying for. For more information on JVM tuning for Kubernetes, the speaker recommended checking out Monika Beckwith’s tuning video on InfoQ. You can also use the Java Flight Recorder to better understand what your JVM is doing.

Minimising your Estate

Another angle to cloud efficiency that came through strongly at Devoxx was the serverless model; the ability to scale applications to zero when not in use. And this doesn’t just include business applications - support applications were also considered. In his talk on Java observability, RedHat’s Ben Evans talked about how some companies could have, for example, 200 servers just for analysing log data! Think about it - you need storage active all the time to capture your log data, sure, (and of course not on the machines that generate the log messages - hard to investigate an outage if your logs were on the machine that went down!) and you need listeners ready to send monitoring/alerting messages, but for complex log analysis you should be able to just spin up the data-reading machines when you need them. Many of the talk demos used KNative to demonstrate apps scaling up/down from zero instances based on load. Our Capgemini talk discusses our CREATE accelerator which spins up the whole development environment - pipelines and all - on checkin of a change. Think about it - do you have Jenkins/Concourse build servers sitting there and eating expensive CPU time when they’re not being used? Not necessary!

More than just a nod to security

As expected, especially with Snyk as platinum sponsors, there were several talks focussing on application security. I went to one detailing some interesting but unlikely ways that, given an unfortunate series of events, deserialization of Java objects or even of JSON could lead to injection attacks allowing hackers to launch applications on your machine. Our own talk focussed on ensuring a secure software supply chain. It’s worthy of a blog in itself, and indeed the talk could have been several talks.

To summarise:

Ensure minimum permissions for every communication
Ensure the provenance of your artifacts,
Aim for zero-trust but be aware it’s idealistic and you may not be able to achieve it,
Be aware that the biggest security threat to your system is YOU!

Capgemini Software Engineering have done it all for you with our cloud accelerator, a series of open-source products tied into an architecture which will spin you up an entire cloud-based secure software supply chain for your development needs - and allow you to tear it all down whenever it’s not in use. Get in touch with us for more information!

And, just… Better Apps!

Another talk I really enjoyed was Functional Programming in Kotlin - exploring Arrow. I’d seen a little of Kotlin before but this talk came in from the angle of problem-solving: Have you been bitten before by support issues involving NullPointerExceptions or ArrayIndexOutOfBoundExceptions? Of course you have! Speaker Ties van de Ven had, too. It was in search of a solution to this pain that he discovered Kotlin’s Arrow library and how it can find these exceptions - at compile time! Yes, really. Using monads (Quote: “If you know what a monad is, you can’t describe it”…) to define a return type that is EITHER an exception OR the value you were looking for as a starting point, you are then forced to deal with the two circumstances. Or you can go a step further and use Arrow Analysis library to run pre/post condition checks at compile time.

So, if you built precondition checks that n>0 into your divide(n) function, this code will compile:

if(a>0) {
 divide(a);
}

But, this code won’t:

divide(a);

Wow! And there was me thinking Kotlin was all syntactic sugar and writing less code. I’m a convert.

The Efficient Cloud Era was originally published by Capgemini at Capgemini Software Engineering on May 20, 2022.

Next.js Fundamentals Through Examples

2022-02-11T00:00:00+00:00

Introduction

As static site generation becomes more and more popular so does the need for the right tools and frameworks.

More often than not there are times when we need lightweight tools such as React to consume a decoupled service and serve static pages. There are a lot of benefits of having a tool to generate static pages for various different reasons such as speed, caching using CDN, SEO etc.

The purpose of this blog post though is to explain Next.js fundamentals - this great React framework that is growing in popularity. We will pick some of its features and will explain with examples and by doing so hopefully it will give readers a good grasp of the framework.

Assumption

As we will explain some of the Next.js features starting from beginning, the assumption is that you are familiar with React or JavaScript.

Prerequisites

Next.js requires Node.js to be installed. If you have already installed Node.js to check the version you have run node -v on your terminal and compare with the latest version Node.js - the minimum supported version is Node.js 12.22.0.

What is Next.js

Next.js is a React based framework built on top of Node.js, with Next.js you can do a wide range of things from creating APIs to consuming external/internal APIs to server side rendering, static generation and a lot more.

As we know React is a hugely popular library, see some statistics here, but it is only the view in the MVC (Model View Controller). Next.js on the other hand is framework that is built on top of React and Node.js. The React documentation makes mention of Next.js as one of recommended toolchain.

Basic Features

There are lots of features that Next.js supports out the box but here I will only focus on some of them such as:

TypeScript
Code Splitting
Routing
Static Generation
Data fetching

Without further ado let’s get started on creating the application.

Creating a Next.js Application

Next.js supports TypeScript out of the box, which means you don’t need to do any additional configuration.

To create we can use the following command:

npx create-next-app
# or
yarn create next-app

For those of you not familiar, npx is a package runner and CLI tool which makes it easy to install and mange dependencies hosted in npm registry. yarn is a package manager.

TypeScript

As we mentioned above TypeScript is fully supported in Next.js and to create an application that uses TypeScript you would type the following command:

npx create-next-app --ts
#or
yarn create next-app --typescript

As you can see we use --ts and --typescript flags to tell the CLI tool create-next-app to create the application using TypeScript.

Now let’s create our application by navigating to your preferred directory and type this command on your terminal:

npx create-next-app nextjs-with-typescript --ts

nextjs-with-typescript is the name of our application.

The above script will install all the necessary dependencies and when it finishes will print in the screen some commands that you can use to run the application. That’s all it is to that, you will now be able to run the application without extra configuration that you would normally need in order to compile TypeScript.

Code Splitting

Code splitting is an optimisation technique that splits the code in chunks or small bundles which then can be loaded on demand or in parallel, this way it enables the application to load a lot faster.

The importance of code splitting can be best seen on growing application, as the application grows so does the size of the JavaScript file.

Next.js Code Splitting

Next.js has built-in support for code splitting, that means you don’t have to use any external plugins such as babel.

When loading the page Next.js only loads the JavaScript necessary for that particular page. Next.js does this by analysing the resources that is importing. If for example one of your pages makes use of axios library, then that specific page will include axios in its bundle. In this way we make sure that we only send the JavaScript needed to the client.

Next.js also supports dynamic import(), this feature makes it possible to import JavaScript modules dynamically and load each import as a separate chunks. To get an understanding how that is done you can have a look at your application’s built directory which is .next.

As we mentioned earlier, when you create the application it will generate a README file which has some basic information such as how to run and build the application, for convenience, I’ll list some commands here:

# to run the application
npm dev 
# or 
yarn dev

# to build the application
npm build 
# or 
yarn build

In order to see the built folder you would need to build your application.
Once that done you can then navigate to .next directory, there you will see something like the following:

|- chunks
  |- {someNumber}.{hash}.js
  |- commons.{hash}.js
|- runtime
  |- main-{hash}.js
  |- webpack-{hash}.js
|- {hash}/pages
  |- _app.txs
  |- _error.txs
  |- index.txs

As we can see from the above, the code splitting is done by chunks, runtime and by page.

Routing

Routing is another feature that Next.js supports out of the box. Next.js uses the file system to enable routing, every file that you put under the pages directory with the extension .js, .jsx, ts or tsx automatically becomes a route.

Pages A Next.js page is a React Component, in our application that we created earlier on we are going to go ahead and create a page under pages directory called about.tsx as follows:

// file pages/about.tsx
function About(){
  return <h1>About</h1>
}
export default About

The above is a React Component that simply returns a h1 heading. Now If we run the application and go to /about route, you will see the about page that has About tag that we just created. That’s how easy it is to create a route.

Index Routes

In Next.js a file named index.ts or index.js in the root directory of any directory under the pages directory, will automatically be a route.

Here are some examples:

Creating an index.ts page at pages/index.ts will create a route /.
Creating a page at pages/blog/index.ts will create a route at /blog

Nested Routes

If we need to create a nested structure then under pages directory we would create directories and files which would then map to the routes. Here are some examples:

pages/articles/my-first-article.tsx will create a route /articles/my-first-article.
pages/admin/settings/user.tsx will create a route at /admin/settings/user

To demonstrate this we will create the following two pages in our app.

// file pages/articles/my-first-article.tsx
function MyFirstArticle(){
  return <h1>My First Article</h1>
}
export default MyFirstArticle

// file pages/admin/settings/user.tsx
function User(){
  return <h1>User</h1>
}
export default User

Now if you run the application and navigate to /articles/my-first-article or /admin/settings/user you will see the above components being served respectively.

Dynamic Routes

As explained above routes are defined based on the file and folders that we create under pages i.e. every file in there maps to a route.

Having said that, there are lots of cases in more complex applications that predefined routes are not enough and hence where the dynamic routes come in.

To create dynamic routes you can use square brackets in the name of the file like so [param].

In the following we are going to create a dynamic route so when people go to articles/<id>, the article id will be sent as a query parameter to the page, in turn we can get this article id using router query object and do further processing.

Let’s start by creating [aid].tsx file under pages/articles/ like the following:

// file pages/articles/[aid].tsx
import {useRouter} from 'next/router'

const Article = () => {
  const router = useRouter()
  const {aid} = router.query

  return <p>Article id: {aid}</p>;
}

export default Article;

The route articles/3 will be matched by [aid].tsx so now If you go to articles/3 it will display Article id: 3. The route articles/3 will have this query object {aid:'3'}. The id can be anything that serves best your needs that is to say it can be a string a number etc.

If the route has articles/3?foo=bar then router query object will have {foo:'bar', aid:'3'}, i.e. if you do console.log(router.query) you will then see the above values.

Nested Multiple Dynamic Routes

In cases where you need two levels of the route to be dynamic such as articles/3/a-comment then you would create a folder and another file under that folder. Say for example you want to capture the article id and its comment something like this http://localhost:3000/<id>/<comment> then in this case would create a directory [aid] under the pages and a TypeScript file under [aid] so then you would have this structure pages/articles/[aid]/[comment].tsx.

Then in the [comment].tsx file put the following:

// file pages/articles/[aid]/[comment].tsx
import {useRouter} from 'next/router'

const Comment = () => {
  const router = useRouter()
  const {comment} = router.query

  return <p>Comment: {comment}</p>;
}

export default Comment;

If you go to this route now articles/3/a-comment the query will have

{comment:'a-comment', aid:'3'}

As inside directory [aid] we are mapping several routes then one way to tackle this to create an index file inside [aid] directory which would match /article/<id> and another one in our case [comment].tsx to map articles/<id>/<comment>. The folder structure would then be like the below:

.
├── [aid]
│   ├── [comment].tsx
│   └── index.tsx
└── my-first-article.tsx

Catching all Routes

In cases when you want to catch all routes, then first you would create a file under the preferred directory in our case pages/[...slug] like so:

// file pages/[...slug].tsx

import {useRouter} from 'next/router'

const CatchAll = () => {
  const router = useRouter()

  return <p>This page catches all routes</p>
}

export default CatchAll;

If you now run the application and navigate to http://localhost:3000/a it will display the page that we created above.

If you navigate to http://localhost:3000/a/b and observer the query like so console.log(router.query) then you will notice that slug now has the following:

{
  "slug": [
        "a",
        "b"
    ]
}

Static Generation

One of the great features that Next.js has is static site generation. There are a lot of frameworks can generate static sites, but what makes Next.js different to other SSG frameworks is the fact that Next.js is hybrid tool which can generate HTML/CSS/JavaScript at run time as well as at build time, this and lots of other features makes Next.js a truly great React SSG framework.

Next.js has two form of pre-rendering, Static Generation and Server-side Rendering. The difference between the two is when static assets, such as HTML, JavaScript, CSS etc., are generated.

With the Static Generation HTML pages are generated at build time and after that they are reused at each request, whereas with Server-side Rendering the pages are generated at each request.

If Static Generation is used pages will be generated when you run next build and from there you can use a CDN if you want to cache the assets.

Static Generation without Data Fetching

This is a simple rendering of static pages such at the page that we have created at the beginning my-first-article.tsx

function MyFirstArticle(){
  return <h1>My First Article</h1>
}
export default MyFirstArticle

Once we created this page and build the application this page will be available as am HTML page.

Data Fetching and Static Generation

In cases where you have to fetch the data from an API or similar then you would use built-in function called getStaticProps to fetch the data at build time and then serve static pages.

If you are also catching paths then you would need to use getStaticPaths built-in function. This function can be used in addition to the getStaticProps.

Data Fetching and Static Generation Working Example

In the following will got through an example to illustrate this. Let’s say we are capturing article data from an API the way we would implement that is to first fetch the data using getStaticProps then we would have another function to consume those data.

In the following example we will use an example endpoint https://jsonplaceholder.typicode.com/posts that has random posts.

function Article({props}) {
  // Here you can further work with props and manipulate data as required
}

export async function getStaticProps() {
  const res = await fetch('https://jsonplaceholder.typicode.com/posts');
  const articles = await res.json()

  return {
    props: {
      articles,
    }
  }
}

export default Article;

In the above functions we are catching all the data from the endpoint and returning them as props, then in turn we are passing the props to the above function as a parameter where we can do further processing.

Conclusion

If you are familiar with JavaScript, Next.js is very easy to use and learn. Out of the box support for TypeScript, server-side rendering, static page generation and lots more, makes Next.js a very strong contender to use for your next project, be it a blog or a complex application that consumes an internal/external API.

Next.js Fundamentals Through Examples was originally published by Capgemini at Capgemini Software Engineering on February 11, 2022.

Introducing the DCX React Library

2022-01-21T00:00:00+00:00

At Capgemini, and for most projects, we usually create the same standard basic components for example buttons, form inputs, radio buttons, etc. These components can be time consuming to set up initially, given we need to implement all the logic like validation, user interaction, tests, documentation, QA validation, AA accessible and fix any bugs raised also. This step can take a fair bit of time and effort, especially when building solid foundations for our software. It can take up to 3 to 4 sprints for a full team to build a set of high quality common components and this is why in DCX (Digital Customer Experience) we decided to start building a library that will speed up this initial process.

Day 0: The initial challenge

When our DCX team (Digital Customer Experience) at Capgemini started building our first library we were under the impression that a component library was of the same effort as creating a web application for a client. The team were very much used to developing in client-oriented development processes, with a well guided set of requirements usually driven by a business analyst, style guide and colour palette from a UI/UX designer. Conversely, the development process of a library with no pre-defined requirements was an uncharted territory so the first question the team faced was “Where do we start?”. After a few weeks, the team came to the realisation that the strategy had to be re-evaluated and this blog post will explain the journey thus far and will share the story of the development of the dcx-react-library.

Day 1: Why do we need another library?

Sometimes a developer may suggest using bootstrap or react-material for the UI. After all, these libraries already have a pre-built set of components that will speed up our development. The problem using these common libraries is that our clients’ brand and project requirements need the UX to be specific and unique.

We’ve tried it many times before and it hasn’t worked. We really need our own set of base components… no hold on I got an idea… what about if we go on Google and research some components we need and we can import into our project? That should work.

Well, occasionally a third-party library works but after few months you may regret your decision for several reasons, such as:

My project has custom requirements the library can’t support. You could create a ticket on the GitHub project in the hope that the owner implements it. Hopefully within a few weeks or a month. Hopefully.
There’s a bug. You create a ticket on the GitHub page, and no one replies to you, all the while your client is pressuring you to resolve the problem that’s impacting all their customers.
You get lucky and the library you found and implemented looks and works great out of the box. But… the component you chose has 50 dependencies and after a while some are obsolete, and others have security vulnerabilities. It’s not long before your project is failing all the health checks.
The component you imported is now deprecated and no longer maintained by the author.

I can easily continue listing more reasons, but I think you understand what prompted us to create our own library.

Day 2: What will our library look like?

After we all agreed that relying on different libraries is not a sustainable approach, we started to think how we could build a library that can be used by all our clients and what kind of components we need to create.

We didn’t want to repeat the same mistakes and limitations we saw in the other libraries, so we decided to implement our library our way:

Style agnostic: all components are built without styling, so consumers are able to style components as desired based on a set requirements of the clients and UI/UX experts.
Few dependencies: we don’t want to rely on external dependencies but at the same time we don’t want to reinvent the wheel, so we decided to use few external dependencies (only 5) that each have 0 dependencies themselves.
With super-powers: Every component is both flexible and extensible, there are intrinsically built with possible requirement and use cases in mind.
Small bundle: every possible technique should be used to provide a tiny bundle consisting of only a few kilobytes.
Fully tested: our library will have 100% code coverage.
AA accessible: every component is built and tested to be 100% accessible.

Day 3: How will we add documentation to our library?

I used to Google a lot, looking for the next cool library to use but I’m simply not patient enough to read thousands of lines of code because a library isn’t well documented. I know it’s really boring for most developers to write good documentation but if we want someone else to use our library, we honestly don’t have a choice.

Our documentation contains around 190 stories, is organised by components and each component has 4 main sections:

Documentation: explains the aim of the component with a simple example and lists all the available properties while providing a detailed description for all of them.
Un-styled: UI/UX designers may be shocked to see this section because they’re going to see how the component looks naked, without any styles applied.
Styled: UI/UX designers will feel better here. In this section we provide an example of how the components look once a style is applied. We also provide the code and CSS that can be copied for use.
Live: this is the section that we love the most. In this section developers can interact with the code, adding and removing properties to see how the component renders.

Day 4: How many components have we built so far?

OK, I need to come clean… we didn’t just spend 3 days getting to this point. It’s taken considerably more and at the time of writing this post we’ve released version 0.3.6 into production, and we’ve already started working on version 0.4. The library has more than 20 components and most of them are being used in some of our client projects.

We look forward to creating more and will continue to listen to our developers and clients, adding more functionality and simplifying the usage of the components.

What’s next?

If you’d like to know more about the library, feel free to contact Daniele Zurico or Isaac Babalola.
If you curious to give it a try, it’s publicly available on npm.
If you’d like to contribute, you will need to request access from our private repository.
If you want to familiarise with all the components we built, take a look at our storybook documentation.

Introducing the DCX React Library was originally published by Capgemini at Capgemini Software Engineering on January 21, 2022.

The Software Engineering Gender Gap

2021-10-15T00:00:00+01:00

A Brief History Lesson

There is currently a large disparity in the percentage of female vs male Software Engineers. According to Wise women make up just 16.4% of the IT Engineering workforce. In education another study shows that Computing has one of the lowest take-ups by women across all Engineering categories, with just 16% of degree candidates being female.

But it wasn’t always this way. There have been plenty of female role models within software engineering in the past, in fact computers wouldn’t be what they are today without these pioneering women:

Ada Lovelace (1815 – 1852) - regarded as the first computer programmer
Grace Hopper (1906 - 1992) - a computer programming pioneer who invented the compiler and theory behind high-level programming languages
Dorothy Vaughan (1910 - 2008) - NASA’s first African-American manager, taught herself and her staff Fortran
Mary Kenneth Keller (1913 - 1985) - first woman to earn a doctorate in computing in the US, part of the team who created BASIC
The ENIAC Programmers (1945) - 6 women who were the first programmers of the first digital computer
Carol Shaw (1955 - ) - one of the first female game designers
Radia Perlman (1958 - ) - invented the spanning-tree protocol, and contributed to many areas of network design and internet routing

Computer technology really emerged during World War II, at this time women made up the majority of the engineering workforce. By the 1960s men made up a large majority of all workers. But in software more than one in four programmers were still women. Computing was largely seen as a woman’s job, and women benefitted from a positive stereotype at this time. However there was a large pay gap between women and their male counterparts, according to the book Recoding Gender: Women’s Changing Participation in Computing, in 1969 the median salary for female computer specialists was $7,763, where men earned a median of $11,193 doing the same job! In spite of this, until 1984 the uptake of women in software engineering was increasing. According to the below graph (admittedly from an American source) the percentage of women studying software engineering degrees topped out at 37%, and has been declining ever since.

There are many theories for the growing gender disparity. One possible reason is the 1970s recession meant programmers weren’t in demand at this time which could have led to a drop off in female uptake in the field. Another potential reason is the production of personal computers increased the male uptake in software engineering. A 1985 Apple advert showed how much a computer could help boys, and showed a boy teasing a girl who was trying to use a computer. Advertising like this caused stereotypes to shift and computers became perceived as a thing for boys.

How the UK Compares

As previously stated, women make up just 16.4% of the IT workforce in the UK. But other countries have a much higher percentage of women in this area. The graph below shows several countries where women make up 25% more of employees in their tech industry:

Image from: European Women In Tech

According to Unesco, 50% of Malaysia’s engineers are women, and for Oman the figure is 53%. These figures highlight that the UK doesn’t do well when it comes to gender diversity in the workplace, and that there is lots of potential for improvement in this area.

At the time of writing, within Capgemini women make up just over 30% of our junior grades but this decreases to less than 20% of our senior grades and roles. At Capgemini in India, women employees constitute over 35% percent of our workforce. Within our Open Source Cloud Engineering team women make up just 15.4%.

Women in software today

There is lots of work going on to improve gender diversity in the workplace. For 5 years in a row Capgemini has been included in the Times Top 50 Employers for women list. There are also lots of female role models around today, such as:

Parisa Tabriz - A Director of Engineering at Google
Juliana Rotich - A tech entrepreneur and strategic advisor
Shafi Goldwasser - A pioneer in cryptography, won the Turing Award in 2012
Jade Raymond - A video game developer, led the creation of Assassins Creed, founded Ubisoft’s Toronto subsidiary
Sara Haider - Engineer for Android apps at Twitter, she was the leader of development for Vine that had 200 million users at its peak
Amanda Wixted - An iOS app developer, led development of FarmVille, founder of Meteor Grove Software
Women at Capgemini - Capgemini have many great female engineers

There are a lot of stereotypes that are still being applied to women in software engineering, but this is no longer tolerated within mainstream companies. In 2017, a Google employee named James Damore wrote in an internal email about qualities he thought were more commonly found in women. This included higher rates of anxiety, and he assumed this explained why they weren’t thriving in a competitive world of coding. Google fired Damore, however his opinion does reflect what some people within the software industry think, and highlights the stereotypes that are often applied. Most companies though do not allow this kind of behaviour, and the Capgemini software engineering team responded to Damore’s comments with this blog post.

Another issue that is still prevalent is the pay disparity between genders. This is often caused by men holding more senior positions within a company compared to women. The gender pay gap for the UK was 15.5% in 2020 according to Statista, this gap has been narrowing in recent years and is down from 27.5% in 1997.

What can be done

Having researched the statistics of various countries, and worked with many women, I can safely say they are just as capable as men at software engineering. In fact gender diversity benefits businesses in the same way that any diversity does, more ideas and increased flexibility leads to better products! There are many stats that show having a better gender balance increases the profitability of a company, for example Forbes and Gallup. Being more gender diverse leads to a more innovative business, as diverse employees come with different experiences and viewpoints. This increases a businesses ability to solve problems which is key in the software engineering game. According to the 2011 census women make up 51% of the population in England and Wales, that is a lot of potential talent that software companies could be tapping into. At the moment with a male dominated workforce we are only utilising 49% of the talent available. Imagine what could be achieved with 100% of that talent! I personally quite fancy a smartphone that is also a robot, I think this one is in need of some gender diverse talent!

We have a role as engineers to ensure the workplace is a fully inclusive space. There are many stereotypes and unconscious biases around female software engineers that aren’t true and aren’t acceptable within todays society. Everyone has a responsibility to stop any prejudice they encounter. Capgemini does a lots of work in this specific area, and has published an Active Inclusion Strategy which specifically highlights unconscious bias, and includes a range of materials and information to try an combat any unconscious bias coming from a lack of awareness/information.

The gender diversity gap in software engineering isn’t something that can be solved overnight, but we can all play our part in helping to encourage more women into software engineering careers.

The Software Engineering Gender Gap was originally published by Capgemini at Capgemini Software Engineering on October 15, 2021.

Micro Frontends: an Introduction

2021-10-08T00:00:00+01:00

Microservices & Distributed Systems in General

The growth in popularity of distributed systems is not without reason. Organisations have found that the separation of workstreams along domains / business capabilities has provided teams with a greater degree of product ownership. There is an increase in the overall complexity of the aggregated services of course, as interactions between services have to be agreed (often between multiple teams), but the advantages of independently deployable & scalable components and greater team autonomy should make microservices an attractive option for most projects.

There is a caveat to this however: the term “microservice” is pretty much exclusively used to refer to backend web services. These are services that are usually obscured behind an API gateway and a user interface of some kind. There have been lots of weird and wonderful innovations to improve the way we design and build distributed systems, but that effort is primarily focused around improving consistency and availability for your backend services (not to mention the countless frameworks, plugins and tools to marry with whatever patterns you decide to go with for your system). This isn’t exactly surprising, if you’re going for as close to perfect as possible for your system’s consistency and availability then you’re going to focus on the services that are actually performing the tasks required of the system.

User & Developer Experiences

If we move out of the system’s boundary and look at how our users see it from an external view, we’ll find that they don’t (or rather they really shouldn’t) see anything that gives away its nature or underlying architecture. Users expect a seamless experience when they interact with an application, aesthetically as well as functionally. If your interface seems to change its design system in places, that’s going to make your system look like a patchwork of different bits of software and the illusion of a single consistent application is broken. If your interface doesn’t have clear and unified paths for performing actions, then your users may not be able to use it and you’ve suddenly got a very serious problem.

These risks, combined with the fact that interfaces like web apps are essentially just bundles of static assets that users have access to, has led to the default architectural option being some flavour of monolith for frontend applications. Now I’m sure I don’t have to go through the effort of explaining where monolithic applications fall short, those of you who haven’t had the pleasure of building or maintaining one will undoubtedly have heard tales from those who have. However, it’s worth touching on some techniques that have been used to attempt to distribute work on a monolith between different teams.

A module-based approach can seem like a good idea on first consideration, as you can neatly divide up an application and give ownership of them to various teams. What should also be considered is the aggregation of these modules into the final artifact. At some point, these components are going to have to be bundled together, tested, packaged into a deliverable and deployed. The coupling between these modules is usually tight, which increases the likelihood of changes in one component impacting another. If this is the case, then you’re not making the most of a distributed system since your teams will still have to negotiate when making changes that affect the interfaces between their components.

Now that they’ve been put into the appropriate context, we can finally start talking about micro frontends.

Micro Frontends: a New Alternative

Let’s start with a (fairly vague) definition:

A micro frontend is a semi-independent component that can be independently deployed, and dynamically integrated into, a user interface.

This doesn’t really do much for us, so let’s dig into it a bit. What we’re basically talking about is an extension of the thinking that brought microservices into being - what if we cut up our frontend along some meaningful boundaries and developed them separately? You can split the work between teams and let them build their own components in their own way, which is the first advantage this approach brings. Each team doesn’t have to negotiate with every other team to be able to use their preferred tools, since they own the development and deployment aspects of their product. You could have teams using completely different JS libraries, built using different CI tools and deployed to different platforms, and you can still integrate their work together to form a single seamless frontend. This approach will be appreciated by the teams, as they’re freed from the need to agree with all other teams on what tools they should all use.

Since your teams can all go off and deliver the functionality they’re responsible for, you can also bring in more contributors without impacting developer experience (please note I’m emphasizing the word “can”, as simply throwing bodies haphazardly at a project tends to make things worse instead of better). As long as your frontend is decomposed sensibly, and teams are working together to communicate across their component boundaries properly then you shouldn’t have any problems.

NOTE: At this point it’s worth mentioning that micro frontends are still evolving as a concept and there are lots of ways to use them, so take some of this with a pinch of salt. You should apply these methods in a way that works best for your project, but I can speak from experience that the techniques mentioned in this article have worked for my past projects.

If we take a standard web application and start cutting it up, we can see some clear potential micro frontends:

An ecommerce application, with highlighted sections that can be developed as micro frontends

Here we can see that there are micro frontends to be made for the navigation bar at the top of the page, the cart summary popup, and the store item components. In theory you could have each of these developed and owned by different teams, and integrated together into a single application.

Integration

It’s all well and good to develop parts of an application in isolation, but at some point, you’re going to have to stitch everything together. For microservices this is usually achieved with an API gateway, which serves as the entry point into a system and sometimes takes care of cross-cutting concerns such as authentication pathways. Micro frontends also have a single entry point when a user accesses them, usually referred to as a container or root component. When the application is loaded up this is what is given to the browser, and within it is the code required to load up other components from different micro frontends.

The difference between micro frontends and module-based monoliths should be made clear here; the container application will dynamically load components into the browser DOM from remote sources when they are needed, the initial bundles of JS and HTML that are returned from the user’s initial request do not contain any components belonging to other micro frontends. Instead of this, the root component will contain URLs to various micro frontend resources that are retrieved separately (these could be hosted at a different address, built by a different team using different technologies, and deployed using a different cloud services provider). This gives teams the advantage of being able to deploy a new version of a micro frontend, and have this new version become immediately available through the root component without having to pull new versions into any other components and build them again. You will still need to make sure that any contracts you have established between components are still being respected of course, but you can use the usual techniques for dealing with breaking changes when required.

State & Communication

Building an application with completely isolated components would be a very straightforward thing to do, but I’m sure we’re all aware that it is rarely a scenario that comes up on actual projects. It’s usually inevitable that components will eventually need to communicate with each other, responding to either the user or another system’s actions. Now one way to tackle this would be by following the unidirectional data flow pattern, utilizing libraries like Redux to manage state. This pattern is useful for decoupling UI components from state modifications, but the concept of a central state store presents some problems for us in an application composed of micro frontends.

It would be tempting to keep a state store in the root component and have this passed into micro frontends when they are loaded. This would give each micro frontend access to the same state store, allowing them to share data with each other and update themselves when any changes to the state object are made. The problem with this is that you are adding a dependency on the type of state management library the root component is using, which goes against the idea that each micro frontend should be independent. Micro frontends should be able to differ in their implementation without affecting other components in the final application, so restricting them in this way should be avoided wherever possible.

We can allow each micro frontend to manage its own internal state, but there is still the problem of communication. It could be tempting to define interfaces that are implemented in each micro frontend and made available to any component in the application that needs to use them, but this can create tightly coupled micro frontends which makes things difficult to maintain and augment. It is here that we could get some benefit from thinking about this in a different way.

Instead of thinking about side-effects of actions as something that a component applies to another component through a function call, we can think about an action as an event that is broadcasted via a message broker. By doing this we can keep things loosely coupled, allowing easy integration with other components through subscribing to particular message channels or topics. For example, if we had a shopping application with an “add to cart” button, this button could trigger an event that all interested components (a cart contents component for example) could subscribe to. We can use a number of tools (my current preferred choice being Postal.js) to achieve this, adding relevant data (item IDs etc.) into the body of the event if required.

An advantage of this event-based approach is that you can document your APIs to be used by other teams when integrating with your micro frontend. Tools like AsyncAPI excel at this, giving you a neat, unified definition of all the channels and events that your component(s) will potentially emit, and what they signify. You can also version your events as you would for event-driven systems, allowing for gradual phasing out of old event processing as your system matures and changes. Your API specification can also include information on the build tools you are currently using to export your micro frontend, and the dependencies that it is set up to share (shared dependency resolution is a feature of a number of frameworks used to integrate micro frontends, and would be very important for teams to see when preparing to integrate with your components).

Potential Problems

Building a micro frontend-based application has many advantages, but there are also some potential issues that you should be aware of so your team are spared some painful problems in the future.

Integration technology alignment: you will have to ensure that all teams working on micro frontends are working with the same technologies for exporting their components, as the root application will need to load them all in. There are various options out there, Webpack’s Module Federation plugin and single-spa to name a couple. Module Federation is very flexible, allowing you fine-grained controls over things like shared dependencies. However, single-spa is very quick to get set up and comes with very useful run configurations out of the box (to assist with local development and running micro frontends in isolation). I have had success in the past with the Webpack solution, but you should choose the tool that’s going to work best for your team(s).
Performance: as your development teams are given the freedom to choose the libraries that they use in their micro frontends, there is a risk that your application will become bloated. There are patterns and techniques you can introduce to mitigate this, both technical and organisational. If you have dependencies that are used in multiple micro frontends, you can configure whatever tool you’re using for builds to use dependencies that are shared by the root component. For example, if you’re using React in a number of micro frontends then you can add this in the root as a shared dependency in your Webpack or single-spa config, and then any micro frontend that uses React will be able to load without having a duplicate version of React bundled into its own JS files. You can also collaborate between your various teams to select various preferred libraries, which will limit the amount of different dependencies being loaded into the final application.
Debugging: As you can imagine, debugging an application with lots of event-driven micro frontends can get quite complicated, especially when initially integrating new components into an application. If you’re using an event bus like Postal.js, then using a plugin like postal.diagnostics can help you track events passing over the boundaries separating your micro frontends. There can also be some difficulties when working with your build tool to correctly bundle and load micro frontends into an application. This can be very difficult to debug when you’re having loading issues, so I’d advise creating a scaffold with a bare-bones implementation for exporting a micro frontend using something like Yeoman. With this you can tailor your generator to suit the functionality that your team may need for preparing a new component for integration, decreasing the possibility of running into integration issues when introducing a new micro frontend into your application.
Styling clashes: since you will be loading components built by different teams into the same DOM, you’ll need to put measures in place to avoid CSS class clashes. This can be solved by agreeing to use prefixes in all your CSS classes, or using something like CSS modules.

Conclusion

Micro frontends have a lot of potential if used correctly. It is possible to build large, complex applications by allowing different teams to own part of the web application, but you should take care before you jump straight into using them. As with most patterns and techniques, preparation is key. As long as your teams are communicating properly, and cross-cutting concerns are being handled in a way that is understood by all involved then you shouldn’t run into any problems (at least none that can’t be solved in a relatively straightforward manner).

Micro Frontends: an Introduction was originally published by Capgemini at Capgemini Software Engineering on October 08, 2021.