<p>I recently had the privilege of judging an internal Capgemini hackathon. It was an open brief, but the focus was to be on technology and its application to solve a real-world business problem. The entries were varied and excellent, from a dashboard to assess how warm/busy/accessible the office was so you could decide whether or not it was worth going in, to gamification of training, to improvements for mountain search and rescue teams.
One of the major commonalities across many of the entries was the use of “AI”, where, given our common use of the Azure platform, AI tended to mean Azure Cognitive Search (recently renamed <a href="https://azure.microsoft.com/en-us/products/ai-services/ai-search">Azure AI Search</a>) indexing a set of business documents, with a natural language processing layer on top acting as a chatbot. This made me want to have a go myself and see what I could build!</p>
<h2 id="is-it-ai">Is it AI?</h2>
<p>This architecture, for me, isn’t really using the “AI” bits of AI - even though, if you use ChatGPT (which has been <a href="https://learn.microsoft.com/en-us/azure/ai-services/openai/">available as a product in Azure</a> since Microsoft’s investment in OpenAI), there can be some non-deterministic, generative functionality. But it sure is useful, and could probably ease the burden of the HR and support departments of many organisations - possibly even replacing a lot of the staff in those departments. I set out to see if I could build an HR chatbot to handle the kinds of queries a typical HR department employee might need to deal with.</p>
<p>Why HR? Just because everybody hates them?? No…! It’s because of the remit of HR: dealing with the employee lifecycle and needing to prove that a company acts without bias means it must be a heavily process-driven department. Those processes must be documented, and most of the workload of the department is in dealing with queries about the process. The incoming questions are probably not phrased in the same way as the process documentation, so some sort of fuzzy search is required in order to automate the question-answering; for example, translating “how much paid time off do I get when my baby is born?” to “paternity leave allowances” is not a straightforward mapping. This is the reason that previous attempts to automate such departments have failed. Language is too complex for simple mappings and decision trees to replace a person on the end of a line - as anyone who has tried to navigate an automated telephone menu will tell you. Who hasn’t ended up shouting “I WANT TO SPEAK TO A PERSON” down the line? But at the end of the day, the workload is simply regurgitating content from a document repository, and the hard bit is finding the relevant sections - a process that is better automated, because acting as a knowledge base for people who can’t be bothered to read swathes of documentation is a pretty unrewarding job.</p>
<h2 id="the-architecture">The Architecture</h2>
<p>As mentioned, the hackathon entries had been built on the Azure cloud, so we’ll keep to that and use Azure’s concepts. These translate pretty simply to any hyperscaler though, or to open-source alternatives if you want to host your own. For example, on AWS you could use <a href="https://aws.amazon.com/blogs/machine-learning/building-an-nlp-powered-search-index-with-amazon-textract-and-amazon-comprehend/">Textract and Amazon Comprehend</a>, and in the OSS world you’d perhaps use <a href="https://www.nltk.org/">NLTK</a> and <a href="https://lucene.apache.org/">Lucene</a>.</p>
<p>Azure AI Search is a nice tool - a little more than document search, a little less than AI. It can be a bit clunky to get used to, and the pricing model is per GB of storage, which is pretty bizarre - but this can be beneficial if you have query-intensive applications and a small-ish set of documents. We use it as an exotic database view for one of our applications, and it took us a while to get used to the fuzzy query syntax - it’s not really designed for logical queries; it’s much better at giving you best-guess matches for loose search terms - and as such it is well positioned to be the back-end of our HR chatbot.</p>
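<p>To give a flavour of that fuzzy matching, here is a minimal query sketch against the Azure AI Search REST API. The service name, index name, key and api-version are placeholders rather than anything from a real project; the tilde operator enables fuzzy matching when the full Lucene query syntax is selected.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical service, index, key and api-version - substitute your own.
# "queryType": "full" switches on Lucene syntax so the ~ fuzzy operator works.
curl -X POST "https://my-search-service.search.windows.net/indexes/hr-policies/docs/search?api-version=2023-11-01" \
  -H "api-key: <your-query-key>" \
  -H "Content-Type: application/json" \
  -d '{
        "search": "paternaty~ leave~",
        "queryType": "full",
        "top": 3
      }'
</code></pre></div></div>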
<h2 id="the-method">The Method</h2>
<p>I found a couple of tutorials and quick-starts for creating a chatbot over my documentation:
<a href="https://github.com/Azure-Samples/azure-search-openai-demo">Azure Search OpenAI demo</a>
or <a href="https://techcommunity.microsoft.com/t5/startups-at-microsoft/build-a-chatbot-to-query-your-documentation-using-langchain-and/ba-p/3833134">Query your documentation using Langchain</a></p>
<p>The issue I found is that it’s all moving quite fast - faster than the tutorials can keep up with. All mention of <a href="https://www.langchain.com/">Langchain</a> has now gone from the Azure portal (although you can still <a href="https://towardsdatascience.com/talk-to-your-sql-database-using-langchain-and-azure-openai-bb79ad22c5e2">write your own</a> Langchain chatbot), and QnA Maker has moved on too: we now have <a href="https://language.cognitive.azure.com/">Azure AI Language Studio</a>, where you can add your documents via a “Custom Question Answering” project. This is a type of Azure “Language” resource and can be created via the LoCode/NoCode <a href="https://language.cognitive.azure.com/">Language Studio homepage</a>.</p>
<p>The tutorial speedily guides you through a simply-configured web form, although it’s not quite clear what you are actually going to create. Looking at what was deployed after the configuration steps, it sets up an Azure Cognitive Search (AI Search) repository and then enables custom text classification / custom named entity recognition on top of it. The default behaviour appears to be to break the content of your referenced documents down into paragraphs and pull out likely titles/subjects. You can then modify this classification by adding in new questions and answers, or choosing the best answer for given terms.</p>
<p>The free trial only allows you to upload three sources into your AI search repository. So, for our HR example, I’ve downloaded three HR policy documents from <a href="https://staffsquared.com/free-hr-documents/">this handy online repository</a> and added them into my Custom Question Answering repository. This generates a “Knowledge Base” that I can then publish.</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/upload-docs.jpg" alt="Upload documents into your language knowledge base" /></p>
<p>Here we can see the way that the content has been divided up into major terms and paragraphs that may address those terms. I can edit here, and once it’s published I can generate a Bot to act as the user interface to it.</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/knowledge-base.jpg" alt="knowledge base parsed from documents" /></p>
<p>OK, so now onto creating this Bot. As Bots go, OpenAI’s <a href="https://chat.openai.com">ChatGPT</a> is the real deal: generative AI, pre-trained on vast swathes of the English language. For most use cases we would have to “turn off” all the fun, generative stuff for our application (see Guardrails below), and it’s probably overkill to use ChatGPT for this demo - plus, it isn’t included in the Azure free trial tier, so I will be experimenting with the <a href="https://azure.microsoft.com/en-gb/products/ai-services/ai-bot-service">Azure AI Bot Service</a> instead. It should be sufficient for this fairly small and simple demo.
Cost-wise, the Azure AI Bot has a free tier, but it must be hosted via an Azure Web App whose service plan is defaulted to S1 (Standard). This plan, at £75/month to keep it running, is eating rapidly into my free credit!</p>
<p>Configuring the Bot online is pretty straightforward. The web GUI provides you with a customised template for creating the resources that you will need, creating an App Service Plan to launch an Azure WebApp that will host your Bot. The only config you have to do is enter the key of your Language Resource so that you can create a secure connection between the AI service knowledge base and the chatbot. This isn’t documented, but you can find the key by going back to the Azure Portal home and clicking the green squares “All Resources” view, then selecting your Language resource (the resource where Type = Language) and then selecting the Keys and Endpoint menu item. (There are two keys, so that you can refresh them by rotating them individually and hence avoid downtime. Either one is fine.)</p>
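<p>If you prefer the CLI to clicking around the portal, the same keys can be listed with the Azure CLI - a quick sketch, assuming a Language resource called <code class="language-plaintext highlighter-rouge">my-language-resource</code> in resource group <code class="language-plaintext highlighter-rouge">my-rg</code>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical resource and group names - substitute your own
az cognitiveservices account keys list \
  --name my-language-resource \
  --resource-group my-rg
</code></pre></div></div>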
<h2 id="testing">Testing</h2>
<p>Once your Bot is deployed, you can test it by finding it under All Resources and choosing “Test in Web Chat” from the right hand menu.</p>
<p>I tried with a simple question, that I know is answerable with the content in the documents:
<img src="/images/2024-01-08-create-ai-bot-in-azure/good-answer.jpg" alt="Trial question" /></p>
<p>So far so good. The Bot has successfully found the right bit of my documentation and returned a comprehensive and understandable answer. How about another:</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/bad-answer.jpg" alt="Second question" /></p>
<p>Oh dear. “cannot” is not exactly a strong English sentence! Although it has found the relevant section of the documentation, it has not been able to pull out a contextual answer.
I am not sure if it is the Language Service or the Bot which is struggling with this question. Enabling and examining the logs on the Bot Service isn’t that helpful - it just shows HTTP POST requests going to the Bot framework. The Bot framework should be responsible for breaking down the user’s entered text into logical “intentions” that the back-end question-answerer can respond to, and then delivering the back-end response in a human readable form.
I eventually figure out how to <a href="https://learn.microsoft.com/en-us/azure/ai-services/diagnostic-logging">enable logging on my Language Service</a> and discover the query and response that the Bot has sent to the language service:</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/backend-query.jpg" alt="Bot query to language service" /></p>
<p>I can see that the language service has actually done a reasonable job. It’s identified the right paragraph for the query, but returned just a 38.97% certainty rating that this is the right data. Fair enough. So it seems that the issue is with the Bot being able to pull the right piece of text out of the response. This makes me start to wonder about the “Bot” I have deployed. What is it actually based on? There isn’t much documentation I can find, but you can download the source code, which shows that I have deployed something created by the <a href="https://github.com/microsoft/botbuilder-js">BotBuilder SDK</a>. I should be able to run this locally, but weirdly the Bot JavaScript code in my download seems totally out of date with respect to the latest Language Studio API. I have to go back to the drawing board, use one of the <a href="https://github.com/microsoft/BotBuilder-Samples/tree/main/samples/javascript_nodejs/48.customQABot-all-features">later samples</a>, and update the code to correctly declare a method as asynchronous to get the Bot running locally using the Bot Framework Emulator.</p>
<p>To get it to work using Node.js v18.16.0 and restify ^11.1.0, I had to edit the sample code <a href="https://github.com/microsoft/BotBuilder-Samples/pull/3939/files">index.js line 91</a> to declare the method async or it would not start:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Listen for incoming requests.
server.post('/api/messages', async (req, res) => {
adapter.processActivity(req, res, async (turnContext) => {
// Route the message to the bot's main handler.
await bot.run(turnContext);
});
});
</code></pre></div></div>
<p>I was then able to run the Bot locally, connecting to my Azure-hosted Language resource via the <a href="https://github.com/Microsoft/BotFramework-Emulator/blob/master/README.md">Bot Framework Emulator</a>. And of course, as luck would have it, the latest sample doesn’t return such a poor response! It’s still not perfect, but it’s at least a sentence. See below.</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/via-emulator.jpg" alt="Local Bot Service running in emulator" /></p>
<p>It does also prove that the poor response here was the chatbot interpreting the data from the Language Service. The Language Service will return a field called an AnswerSpan which lists, with a confidence score, the section of the documentation it considers most relevant to the question. In the case of my “dismissal” question, the AnswerSpan returned was:</p>
<p><code class="language-plaintext highlighter-rouge">An employee whose appointment is terminated for any reason will be provided with a written statement of the reasons for the dismissal</code></p>
<p>This text was paired with a confidence score of 0.2880999999999997, or circa 29%. Fair enough. So how the cloud-deployed bot extracted the answer “cannot” from this is a bit of a mystery! The new version of my Bot prints the whole AnswerSpan and is, whilst still not exactly accurate, at least better. So how do I fix it?</p>
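<p>If you want to see the raw AnswerSpan data for yourself, rather than digging it out of diagnostic logs, you can query the published knowledge base directly over REST. The sketch below is illustrative rather than my exact setup - the endpoint, project name, deployment name and api-version are placeholders for whatever your Language resource and Custom Question Answering project are called:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Placeholders throughout - substitute your Language resource endpoint, key and project name
curl -X POST "https://my-language-resource.cognitiveservices.azure.com/language/:query-knowledgebases?projectName=hr-policies&deploymentName=production&api-version=2021-10-01" \
  -H "Ocp-Apim-Subscription-Key: <your-key>" \
  -H "Content-Type: application/json" \
  -d '{"question": "Can I appeal against my dismissal?", "top": 1}'
</code></pre></div></div>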
<h2 id="customisation">Customisation</h2>
<p>It seems the way to fix up these simple Bots is to go and add a custom question/answer into the Language Service knowledge base. I try adding a specific answer to the question, “Can I appeal against my dismissal?”. I re-publish the knowledge base and try again.</p>
<p><img src="/images/2024-01-08-create-ai-bot-in-azure/fixed-question.jpg" alt="Adding a custom question" /></p>
<p>This looks much better. But it does imply that quite a lot of user testing and customisation will have to take place before this Bot is ready to replace its human counterparts.</p>
<h2 id="guardrails">Guardrails</h2>
<p>One of the things that surprised people about ChatGPT, particularly in its earlier iterations, was that it was not trained to be accurate. It was trained to please the user. This meant it would return inaccurate answers rather than telling you that it didn’t know the answer, as it had gauged higher satisfaction from “lying”! You don’t want your HR chatbot to lie, so you must use the guardrail settings to ensure that it does not. With ChatGPT, guardrails can be set using natural language, for example you can state:</p>
<pre><code class="language- ">{"role": "system", "content": "Assistant is an intelligent chatbot designed to help users answer their tax related questions.`
Instructions:
- Only answer questions related to taxes.
- If you're unsure of an answer, you can say "I don't know" or "I'm not sure" and recommend users go to the IRS website for more information. "},
{"role": "user", "content": "When are my taxes due?”}
</code></pre>
<p>This configuration will prevent the chatbot from “making up” an answer if it cannot find a decent response in its repository.
When configuring Azure’s ChatGPT chatbot via the GUI, you achieve the above by turning the setting known as “temperature” down to 0. The temperature represents how creative the chatbot can be in getting you an answer. A low temperature results in more “I’m sorry, I don’t know” type answers, but it increases the chances that you’ll get an accurate answer, and that you’ll get the same answer when you ask the same question twice!</p>
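<p>If you are calling the Azure OpenAI chat completions API directly rather than configuring this through the GUI, temperature is just a field on the request body. A minimal sketch - the resource name, deployment name, api-version and key are placeholders:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Placeholders throughout - substitute your own Azure OpenAI resource, deployment and key
curl -X POST "https://my-openai-resource.openai.azure.com/openai/deployments/my-gpt-deployment/chat/completions?api-version=2023-05-15" \
  -H "api-key: <your-key>" \
  -H "Content-Type: application/json" \
  -d '{
        "temperature": 0,
        "messages": [
          {"role": "system", "content": "Only answer from the provided HR policies. If unsure, say you do not know."},
          {"role": "user", "content": "Can I appeal against my dismissal?"}
        ]
      }'
</code></pre></div></div>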
<h2 id="the-cost">The Cost</h2>
<p>So what does this cost to run in Azure? Depending on your Bot type, the cost can vary wildly. As mentioned, I am running my Language instance and my Bot instance in the free trial tier, so I am only paying for the app service to host them, and this is around £75/month. If you were to use an enterprise ChatGPT Bot, costs are over £800/month fixed rate for 40 users, plus 80p per “usage unit” and £20 for any extra users over and above the plan. Still considerably cheaper than making your HR staff deal with these queries, I suppose.
As mentioned, Azure AI Search is priced per GB of data indexed: the free tier runs up to 50 GB, and the Standard tier gives you 25 GB for 27p/hour.</p>
<h2 id="in-conclusion">In Conclusion</h2>
<p>I am impressed with the Azure AI Search offering - it’s powerful and useful - there are so many scenarios where we end up awash with documentation and cannot find the content we need. The chatbots are a varied bunch, but I liked the way you could download the code and run/edit it locally with relative ease. In all, I feel this will be a very common architecture for the business problems of the next year or so.</p>
<p><a href="https://capgemini.github.io/cloud/create-ai-bot-in-azure/">How to (maybe) replace your HR department in 3 easy steps</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on January 19, 2024.</p>https://capgemini.github.io/cloud%20native/spring-cloud-vault-kubernetes2023-07-07T00:00:00+01:002023-07-07T00:00:00+01:00Greg Wolversonhttps://capgemini.github.io/authors#author-greg-wolverson
<p>I have previously written <a href="https://capgemini.github.io/engineering/securing-spring-boot-config-with-kubernetes/">blog posts</a> about securing Spring Boot configuration with standard Kubernetes resources. In this post I’m going to take it a step further with a more productionised pattern of securing Spring Boot microservices with Vault in Kubernetes.</p>
<h2 id="keep-it-secret-keep-it-safe">Keep It Secret, Keep It Safe</h2>
<p>As a famous wizard once said: <em>keep it secret, keep it safe</em>. Whilst this applies to rings and other precious objects, it also applies to the sensitive data that we keep within our applications.</p>
<p>Security is paramount in productionised applications, often being one of the more challenging patterns to implement correctly.</p>
<h2 id="not-all-secret-stores-were-created-equal">Not All Secret Stores Were Created Equal</h2>
<p>As I spoke about in my <a href="https://capgemini.github.io/engineering/securing-spring-boot-config-with-kubernetes/#keeping-secrets">previous post</a>, using Kubernetes secrets for storing sensitive data is considered bad practice for two main reasons:</p>
<ol>
<li>The secrets themselves are stored in base64 format, which provides minimal security on its own.</li>
<li>By default, secrets are <a href="https://kubernetes.io/docs/concepts/configuration/secret/">stored unencrypted</a> in the underlying API’s data store (etcd), meaning anyone with API access can retrieve and modify them.</li>
</ol>
<p>There are several alternatives to using Kubernetes default secrets, and one of the most widely used tools is <a href="https://www.vaultproject.io/">HashiCorp Vault</a>. Vault is an identity-based secrets and encryption management system that provides encryption services protected by authentication and authorization mechanisms. This makes it a much more secure way to store sensitive data. Additionally, Vault offers integration and authentication mechanisms <a href="https://developer.hashicorp.com/vault/docs/auth/kubernetes">with Kubernetes</a> out-of-the-box, providing a proven and secure approach to managing secrets within your Kubernetes cluster.</p>
<h2 id="secure-doesnt-mean-complex">Secure Doesn’t Mean Complex</h2>
<p>Whilst being a challenging pattern to get right, security doesn’t need to be complex. Let’s walk through a simple example of how to set up the Kubernetes auth method locally, and retrieve secrets from a Spring Boot application using <a href="https://cloud.spring.io/spring-cloud-vault/reference/html/">Spring Cloud Vault</a>.</p>
<h3 id="configuring-vault">Configuring Vault</h3>
<p>To begin with, we will configure Vault locally. HashiCorp has a <a href="https://helm.releases.hashicorp.com/">set of available helm charts</a> that you can apply, in order to test and work with Vault. For our example, we will be using the <a href="https://github.com/hashicorp/vault-helm">vault helm chart</a>.</p>
<p>We will use <a href="https://github.com/Praqma/helmsman#what-is-helmsman">Helmsman</a> to manage our helm deployments. If you are interested in learning more about Helmsman, I recently wrote a <a href="https://capgemini.github.io/kubernetes/introduction-to-helmsman/">blog post</a> about it.</p>
<p>Our <code class="language-plaintext highlighter-rouge">dev</code> state file looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helmRepos:
vault: https://helm.releases.hashicorp.com
apps:
...
vault:
namespace: dev
enabled: true
chart: vault/vault
version: 0.24.1
valuesFile: values/vault/values-dev.yaml
</code></pre></div></div>
<p>This will tell Helmsman to deploy the <code class="language-plaintext highlighter-rouge">vault</code> helm chart into our local dev namespace using the values file located at <code class="language-plaintext highlighter-rouge">values/vault/values-dev.yml</code>. The values file contains some simple overriding configuration to enable <a href="https://github.com/hashicorp/vault-helm/blob/main/values.yaml#L746">development mode</a> for Vault. In doing so, it allows us to experiment with Vault without needing to unseal or store keys against it (Note: This should not be done in a production environment).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>server:
dev:
enabled: true
</code></pre></div></div>
<h4 id="kube-auth-method">Kube Auth Method</h4>
<p>After applying the Helmsman state file, we can proceed with configuring the Vault instance. There are two main ways to configure Vault: through the Vault UI or programmatically via the CLI. Since we prefer repeatable processes, having our Vault configuration in code is a better approach. Taking it a step further, we could use the <a href="https://registry.terraform.io/providers/hashicorp/vault/latest/docs">Vault Terraform</a> approach to treat this configuration as infrastructure-as-code. However, that goes beyond the scope of this example.</p>
<h4 id="configuration">Configuration</h4>
<p>Next we need to enable the Kube auth method. The easiest way to do this (programmatically) is via the Vault CLI (which comes pre-installed in the Vault container from the installed helm chart).</p>
<p><code class="language-plaintext highlighter-rouge">vault auth enable kubernetes</code></p>
<p>After enabling this feature, we need to configure the auth method to work with our local kubernetes cluster. There are several ways to configure this, but with the <a href="https://developer.hashicorp.com/vault/docs/auth/kubernetes#kubernetes-1-21">changes introduced in Kubernetes 1.21</a>, there are some documented and recommended approaches. It’s worth reading through the different approaches and understanding their differences. However, for the purpose of this example, we will be using a <a href="https://developer.hashicorp.com/vault/docs/auth/kubernetes#use-local-service-account-token-as-the-reviewer-jwt">local service account as the reviewer JWT</a> because we have Vault running locally in a pod within our cluster.</p>
<p>To enable this configuration, we can run the following command:</p>
<p><code class="language-plaintext highlighter-rouge">vault write auth/kubernetes/config kubernetes_host=https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT</code></p>
<p>This configures the Vault auth method to use the service account token running in the Vault pod itself. This works because the Vault pod is running in our local cluster, the <em>same</em> cluster that Vault will be authenticating against later on when we send requests from our sample service. If Vault was running as an externally managed service (which is typical in a production environment), this approach wouldn’t work, and we’d have to configure the auth method using a more robust approach, such as <a href="https://developer.hashicorp.com/vault/docs/auth/kubernetes#use-the-vault-client-s-jwt-as-the-reviewer-jwt">using the Vault client’s JWT as the reviewer token</a> or possibly <a href="https://developer.hashicorp.com/vault/docs/auth/kubernetes#use-the-vault-client-s-jwt-as-the-reviewer-jwt">using long-lived tokens</a>.</p>
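<p>For contrast, a configuration for an externally hosted Vault would need to be told explicitly how to reach and verify the cluster. The sketch below is illustrative only - the certificate path and token variable are placeholders, not values from this example setup:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative only - an external Vault needs the API server address, CA cert and a reviewer JWT
vault write auth/kubernetes/config \
  kubernetes_host="https://<kube-api-server>:443" \
  kubernetes_ca_cert=@/path/to/kube-ca.crt \
  token_reviewer_jwt="$REVIEWER_SA_TOKEN"
</code></pre></div></div>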
<h4 id="roles">Roles</h4>
<p>Now that we have enabled and configured our auth method, we can proceed to add the other important pieces of configuration. Firstly, we need to configure the role against the authentication method. We will create a role that allows our Spring Boot application to retrieve secrets from our Vault instance.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>vault write auth/kubernetes/role/demo \
  bound_service_account_names='*' \
  bound_service_account_namespaces=dev \
  policies=spring-boot-demo
</code></pre></div></div>
<p>The role above is called <code class="language-plaintext highlighter-rouge">demo</code>; it is bound to any service account (for finer-grained security you would usually limit this to a specific account), restricted to our <code class="language-plaintext highlighter-rouge">dev</code> namespace, and has a policy attached to it named <code class="language-plaintext highlighter-rouge">spring-boot-demo</code> (more on this later).</p>
<p>Each Kubernetes auth method can have any number of roles created against it. The purpose of these roles is to restrict each integrating service to a specific set of secrets through roles and policies. The <code class="language-plaintext highlighter-rouge">role</code> component of this configuration determines which service(s), bound to which service account(s) can authenticate against this method (the auth aspect). The attached <a href="https://developer.hashicorp.com/vault/docs/concepts/policies">policy</a> determines which secrets that service account(s) (and consequently service(s)) can access.</p>
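<p>Once created, it’s worth a quick sanity check of what Vault has stored against the role:</p>
<p><code class="language-plaintext highlighter-rouge">vault read auth/kubernetes/role/demo</code></p>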
<h4 id="policies">Policies</h4>
<p>Vault policies define the fine-grained, path-based access to specific secrets held within Vault itself. The policy we’re using for this example looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>path "kv/spring-boot-demo" {
capabilities = ["read"]
}
path "kv/spring-boot-demo/dev" {
capabilities = ["read"]
}
path "auth/token/lookup-self" {
capabilities = ["read"]
}
path "auth/token/create" {
capabilities = ["create", "read", "update", "list"]
}
</code></pre></div></div>
<p>This policy gives access to secrets held at <code class="language-plaintext highlighter-rouge">kv/spring-boot-demo</code> and <code class="language-plaintext highlighter-rouge">kv/spring-boot-demo/dev</code>; it also includes some default Vault paths which allow the JWT token lookup to occur during login and authentication. For secret lookups, we only need to provide <code class="language-plaintext highlighter-rouge">read</code> access because our service will only be trying to <code class="language-plaintext highlighter-rouge">get</code> specific secrets, not create or update them.</p>
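<p>For completeness, the policy itself needs to be registered with Vault before the role can reference it. Assuming the policy above is saved locally as <code class="language-plaintext highlighter-rouge">spring-boot-demo.hcl</code>, that is a single CLI call:</p>
<p><code class="language-plaintext highlighter-rouge">vault policy write spring-boot-demo spring-boot-demo.hcl</code></p>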
<p>The following diagram gives a high-level view of how policy look-ups and authorisation occur.</p>
<p><img src="/images/2023-06-21-spring-cloud-vault-kubernetes/vault-policy.png" alt="Vault policy access" /></p>
<h4 id="secrets">Secrets</h4>
<p>Lastly, we need to enable a secrets engine and create a secret for our application to use. For this example, we will be using the <a href="https://developer.hashicorp.com/vault/docs/secrets/kv">Key-Value secrets engine</a>. The following CLI command will enable the KV engine for us, with a name of <code class="language-plaintext highlighter-rouge">kv</code> (this should look familiar from our policy outlined earlier).</p>
<p><code class="language-plaintext highlighter-rouge">vault secrets enable kv</code></p>
<p>Next, we can put a secret into our new kv store:</p>
<p><code class="language-plaintext highlighter-rouge">vault kv put kv/spring-boot-demo/dev admin=password</code></p>
<p>Now that we have our Vault instance configured with the kube auth method, a role, an appropriate policy and secret data, we can integrate a sample application to test it.</p>
<h3 id="spring-cloud-vault">Spring Cloud Vault</h3>
<p>To test our Vault configuration and close the loop with our example setup, we will use a Spring Boot microservice, which has endpoint security configured with Spring Security. For this demo, we will be using <a href="https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html">actuator</a> which only <a href="https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html#actuator.endpoints.security">exposes <code class="language-plaintext highlighter-rouge">/health</code> by default for security reasons</a>. Let’s expose some actuator endpoints that could contain sensitive information such as <code class="language-plaintext highlighter-rouge">/env</code> and <code class="language-plaintext highlighter-rouge">/heapdump</code>, and secure them with spring security.</p>
<h4 id="securing-our-endpoints">Securing Our Endpoints</h4>
<p>In Spring Boot it’s fairly straightforward to enable various actuator endpoints. Spring provides a <code class="language-plaintext highlighter-rouge">management</code> config block, which allows developers finer-grained control over which endpoints are exposed, and also which sub-sets of information are exposed at those levels.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>management:
endpoint:
...
env:
enabled: true
heapdump:
enabled: true
</code></pre></div></div>
<p>We will be enabling the <code class="language-plaintext highlighter-rouge">env</code> and <code class="language-plaintext highlighter-rouge">heapdump</code> endpoints as mentioned above; this means we <em>could</em> be exposing sensitive information about our service if they are not secured correctly.</p>
<p>In order to secure the actuator routes properly we need to implement Spring Security. A simple pattern I like to follow is to split my routes into secure and insecure, allowing pass-through traffic for any non-secure route, and then handling secure routes with appropriate <a href="https://auth0.com/docs/manage-users/access-control/rbac">role-based access controls</a>. Our configuration will look as follows, including a ‘management’ style user for access purposes.</p>
<p><code class="language-plaintext highlighter-rouge">application.yml</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>appsecurity:
management:
username: ADMIN
password: ${admin:test}
securedroutes:
management:
- "/actuator/shutdown"
- "/actuator/loggers/**"
- "/actuator/heapdump"
- "/actuator/env"
unprotected:
- "/actuator/info"
- "/actuator/prometheus"
- "/actuator/health/**"
- "/hello"
</code></pre></div></div>
<p>The config above allows us to use a <a href="https://docs.spring.io/spring-boot/docs/2.0.0.M3/reference/html/howto-properties-and-configuration.html#howto-use-short-command-line-arguments">placeholder value</a> for our management user password. This is useful for unit test purposes where we don’t want to create another <code class="language-plaintext highlighter-rouge">application.yml</code> test resource file. If we don’t supply a value at runtime, the default value of <code class="language-plaintext highlighter-rouge">test</code> will be used.</p>
<p>In order for our application to use this configuration, simple configuration properties can be used to map the values to a configuration class:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@Getter
@Setter
@Configuration
@ConfigurationProperties(prefix = "securedroutes")
public class SecuredRoutesConfig {
private String[] management;
private String[] unprotected;
}
</code></pre></div></div>
<p>Our Spring Security config will allow any requests accessing non-secure routes to pass through without any auth checks, whereas any requests to our secured routes will be subject to authentication and authorisation checks. An example of this config is shown below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>private final SecuredRoutesConfig securedRoutesConfig;
@Value("${appsecurity.management.username}")
private String managementUsername;
@Value("${appsecurity.management.password}")
private String managementPassword;
@Bean
public PasswordEncoder encoder() {
return new BCryptPasswordEncoder();
}
@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http
.formLogin().disable()
.csrf().disable()
.authorizeHttpRequests((requests) -> requests
.requestMatchers(securedRoutesConfig.getUnprotected()).permitAll()
.requestMatchers(securedRoutesConfig.getManagement()).hasRole(ROLE_MANAGEMENT_USER)
)
.httpBasic(withDefaults());
return http.build();
}
@Bean
public UserDetailsService userDetailsService() {
UserDetails user =
User.builder()
.username(managementUsername)
.password(encoder().encode(managementPassword))
.roles(ROLE_MANAGEMENT_USER)
.build();
return new InMemoryUserDetailsManager(user);
}
</code></pre></div></div>
<h4 id="configuring-vault-1">Configuring Vault</h4>
<p>In order to configure our Spring Boot service to integrate with Vault, we need two key parts; the Spring Cloud Vault library and our application configuration to integrate with Vault itself.</p>
<p>Adding the following library to the POM file gives us the full spring-cloud-vault implementation:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code><dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-vault-config</artifactId>
</dependency>
</code></pre></div></div>
<p>And the following configuration enables our application to integrate with Vault:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spring:
config:
import: optional:vault://
cloud:
vault:
enabled: ${vault-enabled:false}
application-name: spring-boot-demo
connection-timeout: ${vault-connection-timeout:5000}
read-timeout: ${vault-read-timeout:15000}
authentication: KUBERNETES
kv:
backend: kv
enabled: true
profile-separator: '/'
application-name: spring-boot-demo
default-context: spring-boot-demo
profiles: dev
</code></pre></div></div>
<p>Some of the configuration above might already start to make sense based on how we configured our Vault instance earlier. The main aspects to point out are around the <code class="language-plaintext highlighter-rouge">kv</code> engine configuration:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">backend: kv</code> - this tells Spring Boot the name of the kv secrets engine to lookup in Vault</li>
<li><code class="language-plaintext highlighter-rouge">profile-separator: '/'</code> - this tells Spring Boot the path separator used in the secrets engine, e.g. <code class="language-plaintext highlighter-rouge">kv/</code></li>
<li><code class="language-plaintext highlighter-rouge">application-name: spring-boot-demo</code> - this tells Spring Boot the naming convention of the secret lookup, e.g. kv/spring-boot-demo</li>
<li><code class="language-plaintext highlighter-rouge">profiles: dev</code> - this refers to the active profile Spring Boot is running, as Spring Cloud Vault uses that profile to determine the secret path to use, so dev would give us <code class="language-plaintext highlighter-rouge">kv/spring-boot-demo/dev</code>.</li>
</ul>
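<p>One thing worth calling out, which isn’t shown in the snippet above, is that with <code class="language-plaintext highlighter-rouge">authentication: KUBERNETES</code> Spring Cloud Vault also needs to know which Vault role to authenticate against. A minimal sketch of that extra block, assuming the <code class="language-plaintext highlighter-rouge">demo</code> role we created earlier:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spring:
  cloud:
    vault:
      authentication: KUBERNETES
      kubernetes:
        role: demo
        # defaults to the token Kubernetes mounts into the pod
        service-account-token-file: /var/run/secrets/kubernetes.io/serviceaccount/token
</code></pre></div></div>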
<h3 id="bringing-it-all-together">Bringing It All Together</h3>
<p>Given that we have a local Vault instance set up and a Spring Boot service to integrate with it, we can deploy our app and test the successful retrieval of secrets to secure our application.</p>
<p>Firstly, we will add our Spring Boot service to our Helmsman desired state file.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apps:
spring-boot-demo:
namespace: dev
enabled: true
chart: '../service-helm-chart'
version: 1.0.0
valuesFile: values/service/values-dev.yaml
vault:
...
</code></pre></div></div>
<p>Then we can apply the updated state file:</p>
<p><code class="language-plaintext highlighter-rouge">helmsman --apply -f dev.yaml</code></p>
<p>Once the new Spring Boot service is running successfully, we can test that the actuator endpoints have been secured properly with the secret we set up in Vault.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> kubectl get deploy -n dev
NAME READY UP-TO-DATE AVAILABLE AGE
vault-agent-injector 1/1 1 1 64s
spring-boot-vault-demo 1/1 1 1 65s
</code></pre></div></div>
<p>We can <a href="https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/#forward-a-local-port-to-a-port-on-the-pod">port-forward</a> to the running pod to establish a localhost connection and conduct some basic cURL tests. When calling a secure endpoint without any authentication using cURL, we should receive a 401 response.</p>
<p><code class="language-plaintext highlighter-rouge">kubectl port-forward deploy/spring-boot-vault-demo 8080:8080 -n dev</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> curl http://localhost:8080/actuator/env -v
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /actuator/env HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 401
...
</code></pre></div></div>
<p>Now, when we use cURL to call the same endpoint while providing the authentication secret stored in Vault, we should receive a 200 response, along with the JSON payload that outlines the environment properties stored in the service.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>> curl http://localhost:8080/actuator/env --user ADMIN:password -v
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
* Server auth using Basic with user 'ADMIN'
> GET /actuator/env HTTP/1.1
> Host: localhost:8080
> Authorization: Basic QURNSU46cGFzc3dvcmQ=
> User-Agent: curl/7.79.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200
...
{"activeProfiles":["dev"],"propertySources":[{"name":"server.ports","properties":{"local.server.port":{"value":"******"}}},...
</code></pre></div></div>
<p>And that’s it! All working as expected.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Keeping sensitive information secure in production systems is paramount. With the <a href="https://www.itgovernance.co.uk/blog/data-breaches-and-cyber-attacks-in-2022-408-million-breached-records">vast number of data breaches last year</a>, which caused chaos for those who fell victim, ensuring data security and mitigating attack vectors is critical for engineering robust, well-designed systems. While this post has outlined a simpler approach to integrating a Spring Boot microservice with a secrets management solution, it hopefully demonstrates that it doesn’t have to be incredibly complex to get it right.</p>
<p>You can see all the code to accompany this post <a href="https://github.com/gwolverson/vault-kubernetes-example">over on my github</a>.</p>
<p><a href="https://capgemini.github.io/cloud%20native/spring-cloud-vault-kubernetes/">Keeping Spring Boot Apps Secure With HashiCorp Vault</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on July 07, 2023.</p>https://capgemini.github.io/kubernetes/introduction-to-helmsman2023-05-22T00:00:00+01:002023-05-22T00:00:00+01:00Greg Wolversonhttps://capgemini.github.io/authors#author-greg-wolverson
<p><a href="https://kubernetes.io/">Kubernetes</a> is one of the most popular open-source container orchestration frameworks in use today. It allows you to easily deploy, scale and manage containerised applications. As your applications grow, the number of Kubernetes resources you have to manage increases, and that’s where <a href="https://helm.sh/">Helm</a> comes in. Helm is a package manager for Kubernetes, allowing you to define, install and manage complex Kubernetes clusters at scale. However, unless you want to <a href="https://helm.sh/docs/helm/helm_install/">install</a> all of your <a href="https://helm.sh/docs/topics/charts/">helm charts</a> individually (and possibly manually), there is a need for an automated, infrastructure-as-code approach. Enter <a href="https://github.com/Praqma/helmsman#what-is-helmsman">Helmsman</a>.</p>
<h2 id="the-problem">The Problem</h2>
<p>As mentioned above, in a productionised domain, the set of deployed services and their accompanying resources will grow exponentially. Even when using a package manager like Helm, the sheer amount of deployable resources and packages can become hard to manage.</p>
<p>If you have ten Helm charts to deploy, you could be running ten install and/or upgrade commands to reach the desired cluster state for any given environment. Furthermore, if you have multiple environments (dev, test, preprod, prod etc), you then have ten commands <em>per environment</em> to run - you can quickly see how this could become difficult - not to mention inefficient - to manage.</p>
<h2 id="an-introduction-to-helmsman">An Introduction to Helmsman</h2>
<p>Helmsman is a tool which allows you to define the desired state of your Kubernetes cluster in code, giving you the ability to deploy, upgrade or destroy that state in a single command. Each environment (<code class="language-plaintext highlighter-rouge">namespace</code> traditionally in Kubernetes) has its own state file, making managing versions and resources across environments much simpler.</p>
<p>As a result of Helmsman encapsulating the state of your cluster(s) in code, you can easily describe the state of any cluster by looking at the Helmsman <a href="https://github.com/Praqma/helmsman/blob/master/docs/desired_state_specification.md#helmsman-desired-state-specification">desired state file</a>. This makes it easier to manage what’s deployed, where and at which version.</p>
<h2 id="a-helmsman-story">A Helmsman Story</h2>
<p>Let’s take an example where we have a service domain which contains four microservices. Each microservice has slightly different resource requirements (CPU/memory), and two of them are required to integrate with a database. In non-production environments (dev, test) they are not required to be highly available, whereas in production environments (preprod, prod) they are.</p>
<h3 id="basic-helm-chart">Basic Helm Chart</h3>
<p>We’ll create a Helm <a href="https://helm.sh/docs/topics/charts/">application chart</a> that can define the Kubernetes resources required for each of our services. Our example service chart will contain some standard Kubernetes resources such as a deployment and network policy.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>metadata:
environment: replace-me
deployment:
create: false
replicas: 1
name: replace-me
image: replace me
ports:
- 8080
resources:
requests:
memory: "250Mi"
cpu: "250m"
limits:
memory: "350Mi"
cpu: "300m"
networkPolicy:
create: false
podSelector:
matchLabels:
app: replace-me
policyTypes:
- Egress
egress: {}
</code></pre></div></div>
<p>The above is heavily simplified from what a real production chart may look like, but the purpose here is just to give an example to work from later.</p>
<p>Above you can see a <code class="language-plaintext highlighter-rouge">create: false</code> property on each resource. This is a practice I tend to follow when building Helm library charts, as it gives implementing charts the ability to opt in to whichever resources they need, rather than getting them all rendered by default.</p>
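<p>As a rough illustration of the pattern (a sketch, not lifted from a real chart), the deployment template in the base chart can be wrapped in a conditional on that flag, so nothing is rendered unless an implementing chart opts in:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># templates/deployment.yaml in the base chart - illustrative sketch only
{{- if .Values.deployment.create }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.deployment.name }}
spec:
  replicas: {{ .Values.deployment.replicas }}
  selector:
    matchLabels:
      app: {{ .Values.deployment.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.deployment.name }}
    spec:
      containers:
        - name: {{ .Values.deployment.name }}
          image: {{ .Values.deployment.image }}
{{- end }}
</code></pre></div></div>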
<h3 id="microservice-setup">Microservice Setup</h3>
<p>Each microservice will have its own implementation of the base chart shown above. Let’s first use microservice-a as an example, which has no extra resource requirements and no database connectivity.</p>
<p>Chart.yaml</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
apiVersion: v2
name: service-a
description: Chart for microservice A
version: 0.1.0
dependencies:
- name: base
version: 1.0.0
repository: "@base-repository"
</code></pre></div></div>
<p>values.yaml</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>base:
deployment:
create: true
replicas: 1
name: service-a
image: service-a:1.0.0
</code></pre></div></div>
<p>As you can see above, microservice-a has a very simple implementation of the base chart, mostly using the default values provided.</p>
<p>Now let’s look at microservice-b. This service will have slightly higher resource requirements and will also need egress networking out to a MySQL database (running in a pod in the same namespace).</p>
<p>Chart.yaml</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
apiVersion: v2
name: service-b
description: Chart for microservice B
version: 0.1.0
dependencies:
- name: base
version: 1.0.0
repository: "@base-repository"
</code></pre></div></div>
<p>values.yaml</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>base:
deployment:
create: true
replicas: 1
name: service-b
image: service-b:1.0.0
resources:
requests:
memory: "500Mi"
cpu: "350m"
limits:
memory: "550Mi"
cpu: "400m"
networkPolicy:
create: true
podSelector:
matchLabels:
app: service-b
policyTypes:
- Egress
egress:
- to:
- podSelector:
matchLabels:
app: mysql
</code></pre></div></div>
<h3 id="helmsman-implementation">Helmsman Implementation</h3>
<p>Now let’s look at the Helmsman implementation and how it makes dealing with multi-service deployments simpler.</p>
<p>Our very simple Helmsman folder structure will look as follows (showing only service-a and service-b for brevity):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.
├── dev.yaml
├── test.yaml
|── preprod.yaml
|── prod.yaml
└── values
├── service-a
└── values-dev.yaml
└── values-test.yaml
└── values-preprod.yaml
└── values-prod.yaml
├── service-b
└── values-dev.yaml
└── values-test.yaml
└── values-preprod.yaml
└── values-prod.yaml
</code></pre></div></div>
<p>Let’s look at a desired state file and one of the values files for each service in a bit more detail to show what’s happening.</p>
<p>As mentioned previously, Helmsman provides a way of describing the desired state for your Kubernetes cluster. In the example we’re using, we’ve got two clusters; non-production (containing dev and test namespaces) and production (containing preprod and prod namespaces).</p>
<p>Let’s take a look at the <code class="language-plaintext highlighter-rouge">dev.yaml</code> state file;</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>metadata:
description: Desired State File for the dev environment
namespaces:
dev:
helmRepos:
stable: http://custom-helm-repo-example.com
apps:
service-a:
namespace: dev
enabled: true
chart: stable/service
version: 1.0.0
valuesFile: values/service-a/values-dev.yaml
service-b:
namespace: dev
enabled: true
chart: stable/service
version: 1.0.0
valuesFile: values/service-b/values-dev.yaml
</code></pre></div></div>
<p>There’s a few bits going on in the above state file definition, so let’s break it down.</p>
<p>The <code class="language-plaintext highlighter-rouge">namespaces</code> property allows you to define the namespace(s) you have or want as part of this state definition. If the namespace(s) don’t exist when you run Helmsman, it will <a href="https://github.com/Praqma/helmsman/blob/master/docs/how_to/namespaces/create.md#create-namespaces">create them</a> for you.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>namespaces:
dev:
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">helmRepos</code> property allows you to <a href="https://github.com/Praqma/helmsman/blob/master/docs/how_to/helm_repos/default.md">define the Helm repositories</a> where your packaged charts are stored. There are several options for chart repositories, such as; default, private (backed by Google, AWS or basic auth) and local.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>helmRepos:
stable: http://custom-helm-repo-example.com # This doesn't exist, it's just shown for example purposes
</code></pre></div></div>
<p>The <a href="https://github.com/Praqma/helmsman/blob/master/docs/desired_state_specification.md#apps">apps</a> block is the most important block within the example state file shown above; it defines <em>all</em> the services you want deployed as part of this state file. Helmsman is very powerful and provides a lot of configuration options for <a href="https://github.com/Praqma/helmsman/tree/master/docs/how_to">deploying apps and configuring them</a>. In the example above, we’re using a simple app definition for each service.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>apps:
service-a:
namespace: dev
enabled: true
chart: stable/service
version: 1.0.0
valuesFile: values/service-a/values-dev.yaml
</code></pre></div></div>
<p>An important property defined above is the <code class="language-plaintext highlighter-rouge">valuesFile</code> property, this tells Helmsman where the values file to be installed as part of this release is located within the Helmsman structure.</p>
<p>As displayed previously, our Helmsman file structure contains the following files;</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>└── values
├── service-a
└── values-dev.yaml
├── service-b
└── values-dev.yaml
</code></pre></div></div>
<p>So when we’re specifying the <code class="language-plaintext highlighter-rouge">valuesFile</code> property as <code class="language-plaintext highlighter-rouge">values/service-a/values-dev.yaml</code>, it’s referring to the following folder:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>└── values
├── service-a
└── values-dev.yaml
</code></pre></div></div>
<p>Now let’s look at the contents of those files - this is where the modularisation within Helmsman really shines.</p>
<p>Earlier on we stated that Service A doesn’t have any additional requirements beyond the standard chart specification, whereas Service B has the additional requirements of higher resources and a connection to a MySQL database. With that in mind, let’s look at the <code class="language-plaintext highlighter-rouge">values-dev.yaml</code> definition for each of these services.</p>
<h4 id="service-a">Service A</h4>
<p>Service A <em>only</em> needs to specify the environment it sits within and some basic information about the deployment: name, image and container port. Everything else is already defined in the base service chart that we’re using (as referenced in the Helmsman <code class="language-plaintext highlighter-rouge">dev.yaml</code> state file).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>metadata:
environment: dev
deployment:
create: true
name: service-a
image: service-a:1.0.0
containerPort: 8080
</code></pre></div></div>
<h4 id="service-b">Service B</h4>
<p>Service B on the other hand, needs a bit more configuration to meet requirements.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>metadata:
environment: dev
deployment:
create: true
name: service-b
image: service-b:1.0.0
containerPort: 8080
resources:
requests:
memory: "500Mi"
cpu: "350m"
limits:
memory: "550Mi"
cpu: "400m"
networkPolicy:
create: true
podSelector:
matchLabels:
app: service-b
policyTypes:
- Egress
egress:
- to:
- podSelector:
matchLabels:
app: mysql
</code></pre></div></div>
<p>For the Service B <code class="language-plaintext highlighter-rouge">values-dev.yaml</code> file we have specified the environment, deployment and networkPolicy configuration values. This has allowed us to <em>override</em> and add to the values that are defined in the base service chart we’re using as part of this deployment.</p>
<p>As our project grows, we can easily add more services to our desired state file(s), making the management of our environments much simpler than if we had to manage all the helm charts individually.</p>
<h3 id="bringing-it-all-together">Bringing It All Together</h3>
<p>So now we have our example Helmsman project setup, with our desired state file(s) ready to provision services into our cluster. All we need to do now is issue certain Helmsman commands and we’ll have our services running in no time. Ideally, you’d <a href="https://github.com/Praqma/helmsman/blob/master/docs/how_to/deployments/ci.md#run-helmsman-in-ci">run Helmsman from CI pipelines</a>, but that goes beyond the scope of this post. We’ll now take a look a few of the more widely used commands.</p>
<h4 id="dry-run">Dry Run</h4>
<p>A <em>really</em> useful feature of Helmsman is the ability to use <code class="language-plaintext highlighter-rouge">dry-run</code>. This allows you to point Helmsman at one of your desired state files and do a dry-run installation against your cluster. The benefit of this is you get to see the rendered Kubernetes manifests that would be installed, and can easily verify and validate that the manifests to be installed are correct, without them actually being installed.</p>
<p><code class="language-plaintext highlighter-rouge">helmsman -f dev.yml --dry-run</code></p>
<h4 id="apply">Apply</h4>
<p>Next up is the <code class="language-plaintext highlighter-rouge">apply</code> command. This applies your desired state file to your kubernetes cluster, installing all the resources via Helm.</p>
<p><code class="language-plaintext highlighter-rouge">helmsman -f dev.yml --apply</code></p>
<h4 id="destroy">Destroy</h4>
<p>Another useful command is the <code class="language-plaintext highlighter-rouge">destroy</code> command. This tears down your cluster based on the desired state file - this is useful if you want to tear down environments quickly or nightly to save costs.</p>
<p><code class="language-plaintext highlighter-rouge">helmsman -f dev.yml --destroy</code></p>
<h2 id="wrapping-up">Wrapping Up</h2>
<p>Although this post has only shown a very simple example project, hopefully you can see how Helmsman is a very useful tool for managing our Kubernetes environments. As service domains grow, so do the amount of resources we need to keep track of and implement to keep everything ticking along. Rather than trying to keep a handle on all of those resources manually, it’s better to leverage specific tooling (like Helmsman) to provide consistency, efficiency and a much better developer experience!</p>
<p>Helmsman is just one approach to managing your kubernetes environments, and is a good entryway to more GitOps style approaches such as <a href="https://fluxcd.io/">FluxCD</a> or <a href="https://argo-cd.readthedocs.io/en/stable/">ArgoCD</a> (among others).</p>
<p>You can see all the code for an example service scenario like the one described in this post <a href="https://github.com/gwolverson/helmsman-demo">over on my github</a>.</p>
<p><a href="https://capgemini.github.io/kubernetes/introduction-to-helmsman/">Navigating Kubernetes Deployments With Helmsman</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 22, 2023.</p>https://capgemini.github.io/development/preparing-for-devoxx2023-05-10T00:00:00+01:002023-05-10T00:00:00+01:00Sarah Saundershttps://capgemini.github.io/authors#author-sarah-saunders
<p>It’s a big deal preparing for sponsorship of a conference. Each year, the Cloud Development team at Capgemini are proud to sponsor <a href="https://www.devoxx.co.uk">Devoxx UK</a>, the leading developer conference in Britain. What does this involve? A lot more creativity than you might think!</p>
<h2 id="theme">Theme</h2>
<p>Capgemini is a huge global company with global annual goals and missions. We are a relatively small team of 100 or so UK-based software engineers, so aligning our goals with the wider company can be something of a challenge to start with! This year, Capgemini’s purpose was a good start for us: “Unleashing human energy through technology for an inclusive and sustainable future”. The Capgemini brand platform of “Get the future you want” is also good for Devoxx - part of the reason we’re sponsoring the conference is to remind people that we’re always recruiting for new talent and can offer a great place for a software engineer to work and develop themselves.</p>
<p>We knew we wanted to focus on sustainability as it’s one topic close to all our hearts as well as being a <a href="https://www.capgemini.com/be-en/about-us/csr/environmental-sustainability">Capgemini goal</a>. But we didn’t want to make people feel depressed or personally responsible - we wanted to inspire them to make a difference. I recently attended a Capgemini “<a href="https://climatefresk.org/">Climate Fresk</a>” workshop, and whilst it was educational, it was mainly terrifying! The concept of a “Fresk” is that it works rather like a round table discussion: there are a number of prompt cards around environmental topics such as carbon dioxide levels, deforestation, rising sea levels, weather pattern disruption, sea water acidity levels, plant and animal diversity, CFCs, forest fires, and population migration. The cards could be positioned to show which events affected which other topics, and so during the hour we built up a map of the effects of fossil fuel extraction on our planet. Most of the information I had seen before, but I did learn and relearn a couple of things. For example, as more CO<sub>2</sub> is absorbed by the ocean, the pH of the sea water falls and the sea becomes more acidic. This makes it difficult for small sea creatures to form shells, because their shells are made of calcium carbonate, which dissolves in more acidic waters. These tiny creatures are the base of the food chain in the sea, and so depletion in their numbers has a massive effect on the population of larger sea creatures. I came out of the fresk feeling informed but scared - not an emotion we want people to associate with Capgemini! But neither do we want to shy away from the problems that burning fossil fuels are causing. Instead, we decided to focus on the positives. This year’s <a href="https://www.capgemini.com/insights/research-library/technovision-2023/">TechnoVision</a> report contains some powerful and brave messages for Capgemini employees - for example, “Do more with less”. This is very often not the easy route, and for Capgemini perhaps not the most profitable route either; it takes courage to tell your clients that the best route forward is not to build any software at all! But we need to recognise that sometimes this is the right answer. Wasteful technology is something we can all do without.</p>
<p>Taking that as a starting point, what about useful tech? How can we make it more carbon efficient? We know that great strides have been made towards generating carbon neutral electricity, for example from renewable energy sources such as wind, waves and sun. We know that many European countries generate quantities of electricity that way when they can - but it depends on factors such as hours of sunlight, wind speed, river levels. How can we know whether the electricity our application farms are using is generated from renewable sources? Turns out there is a way. <a href="https://www.energymonitor.ai/">The Energy Monitor website</a> collects data from 27 European countries (many of which host data centres for major cloud providers such as Amazon and Azure) so that you can see, for a given point in time, which country is producing the most electricity from renewables. What if you could use this information to move your applications to the data centre using the most “green” electricity? Now that’s inspiring!</p>
<p>A lot of what we do as Cloud developers gives us opportunities to make electricity savings by reducing the amount of compute power we use. The fantastic advantage of infrastructure-as-code is that you can safely tear down huge proportions of your infrastructure when you’re not using it - for example, only have your build pipelines running when you actually have something to build. We are contributing to a Capgemini “Green Book” of practices that we can share with our clients, to help them reduce their carbon footprints with minimal impact to their businesses.</p>
<h2 id="swag">Swag</h2>
<p>OK so now we have a phrase to print on our stand (“Get the future you want”), and a theme. What next? We want something to give away that is useful and reusable, that isn’t plastic, that’ll remind people of meeting us. We need some cotton T-shirts! <a href="https://capgemini.github.io/development/the-efficient-cloud-era/">Last year</a> we brought along 40 or so of our Capgemini / Ada Lovelace “I am a Role Model” T-shirts and they went like hot cakes. Unfortunately the million tonnes of paper notepads and sweets that we brought along didn’t go down so well - turns out devs don’t write much and are rather healthier than we’d given them credit for! So we know that T-shirts are the way to go. But what picture can we put on them, that developers attending the conference will want to wear?
Recently, Capgemini opened a new “Delivery Centre” in our Holborn office. This is a step back to teams working face to face, appreciating the value of getting people together. For the centre opening, our resident artist had drawn some fantastic images of an octopus busily multi-tasking that had got a lot of attention. Octopuses have started popping up all over the delivery centre and we thought we’d get involved! So we stole an octopus for our T-shirt.</p>
<p><img src="/images/2023-04-28-preparing-for-devoxx/Devoxx_Tshirt_Blank.jpeg" alt="T-shirt design with octopus" /></p>
<p>All we needed now was some text to go with it. I turned to my team to get the best octopus puns and they didn’t disappoint.</p>
<ul>
<li>“Be INK-redibly productive”</li>
<li>“Don’t be a sucker! Develop at octo-speed”</li>
<li>“8-bit computing”</li>
<li>“Octo-pushing delivery forward”</li>
<li>“Un-LIMB-ited potential”</li>
<li>“Kraken’ on with development”</li>
<li>“This is what beak performance looks like”</li>
<li>“Be an octo-coder”</li>
</ul>
<p>OK thanks guys. Enough already. We decided that the un-LIMB-ited potential slogan fitted really well with our concept of “Get the future you want”, and so our T-shirt is a wrap! Now we can start to look forward to the talks on offer at the conference and up-skilling ourselves for the year ahead. Roll on May 10th!</p>
<p><a href="https://capgemini.github.io/development/preparing-for-devoxx/">Preparing for Devoxx</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 10, 2023.</p>https://capgemini.github.io/frontend/modern_frontends_live_uncut_gems2023-04-14T00:00:00+01:002023-04-14T00:00:00+01:00Julie Vaccalluzzohttps://capgemini.github.io/authors#author-julie-vaccalluzzo
<p><img src="/images/2022-11-17_Modern_frontends_live.jpg" alt="modern frontends live main stage" /></p>
<p>It wasn’t the best conference that I have been to. A lot of the speakers complained in public about the lack of organisation. The event was not live streamed as promised, some speakers pulled out, none of it was recorded, and although it was held at the Excel Centre, the venue within the centre was at the very end and poorly signposted. The coffee was super gross, but I didn’t drag myself away from my comfy home office for the free lunch. The content was great. It can be hard to keep up with what is going on unless you are knee deep in the cesspool of Twitter feeds on a daily basis. Attending conferences and meetups is a great way to play catch up.
I am not sure what this post-pandemic world holds for conferences. It’s nice to see a slew of conferences streaming live, although I find it too easy to not attend or pay attention when you are not physically there.</p>
<p>Here is a list of ‘uncut gems’ and people I think are well worth a follow, providing Twitter doesn’t turn into a giant sinkhole. Enjoy!</p>
<h2 id="things-we-should-stop-using-javascript-for">Things we should stop using JavaScript for</h2>
<h3 id="native-browser-elements">Native browser elements</h3>
<p><a href="https://speakerdeck.com/init/stop-using-js-for-that-by-kilian-valkhof-init-2022">Stop using JavaScript for that: move features from JS to CSS and HTML</a></p>
<p><a href="https://twitter.com/kilianvalkhof">Kilian Valkhof</a></p>
<p>Use the native <a href="https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-reduced-motion">colour picker</a>: <code class="language-plaintext highlighter-rouge"><input type="color"></code> instead of building one in JavaScript.</p>
<p>I never thought I would see a ‘pop up’ become native after Google gave the idea the axe because it takes the user away from the content. The modal or dialogue lost favour from a UI perspective. Things change, so it’s always a good thing to know that this native dialog element exists. I can certainly see this native feature being used for browser-based software.
<a href="https://developer.mozilla.org/en-US/docs/Web/API/HTMLDialogElement">HTMLDialogElement</a></p>
<h2 id="css-how-does-css-work">CSS How Does CSS Work?</h2>
<p><a href="https://twitter.com/eladsc">Elad Shechter</a></p>
<h3 id="reinventing-the-reset">Reinventing the reset.</h3>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/CSS/all">CSS all</a></p>
<p><code class="language-plaintext highlighter-rouge">all: unset;</code></p>
<p>Unsets everything except the <code class="language-plaintext highlighter-rouge">unicode-bidi</code> and <code class="language-plaintext highlighter-rouge">direction</code> properties. No more overriding overrides of overrides. I wish I had this when I was forced to use Bootstrap for all site development.</p>
<p>Without using <em>before</em> and <em>after</em> elements, you can use <em>all: unset</em> to build a custom <a href="https://codepen.io/elad2412/pen/jOymRJy">checkbox that looks more like a mobile UI interface</a>.</p>
<p><code class="language-plaintext highlighter-rouge">display: revert;</code></p>
<p>Rolls back the cascaded value to the user agent’s default style.</p>
<p>Using this as a launch point may I introduce the new <a href="https://elad2412.github.io/the-new-css-reset/">CSS reset</a>. Like Reset and Normalise, <em>CSS reset</em> does a similar thing with half the code. We no longer need to specifically override every property.</p>
<p>If you want to find out more about where it came from and how it’s put together read <a href="https://elad.medium.com/the-new-css-reset-53f41f13282e">The new CSS Reset</a></p>
<h2 id="the-four-principles-of-accessibility">The Four Principles of Accessibility</h2>
<p><a href="https://twitter.com/xirclebox">Homer Gaines</a>, Certified #A11y Professional</p>
<p><a href="https://twitter.com/i/communities/1470900050029072386">Twitter accessibility communities</a></p>
<p>Refreshing perspective from across the pond. You can <a href="https://www.youtube.com/watch?v=RUxx_sq2QdY&ab_channel=UXDX">watch the presentation on YouTube</a>.</p>
<p>Tab index should never be greater than 0. Negative values are not reachable with keyboard navigation, but can be useful should you wish to make an element focusable on another action.</p>
<p>HTML semantics can sometimes be a pain. Design may want to give weight to copy when semantically it does not make sense, which can hurt your SEO and page rankings, not to mention make a mess of your accessibility and CSS overhead.</p>
<p>Welcome <a href="https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/Roles/presentation_role">ARIA presentation</a>. This means that these elements and all their children are not exposed to assistive technologies.</p>
<h2 id="beyond-the-browser--how-to-talk-with-robots">Beyond the browser – how to talk with robots</h2>
<p><a href="https://twitter.com/nic_o_martin">Nico Martin</a></p>
<p>It is now possible to use the browsers’ native <a href="https://developer.mozilla.org/en-US/docs/Web/API/Web_Bluetooth_API">Bluetooth API</a> to connect with things to do stuff.</p>
<p>Nico used the browser to communicate via <a href="https://slides.nico.dev/221118-robots-modern-frontends-live/#/">Bluetooth to his raspberry pi driven LED matrix</a>. The feature is experimental and needs to have the flag turned on, but this is a good indicator of the direction of how the internet of things may interact with the browser, or any browser.</p>
<h2 id="the-webs-next-transition">The Web’s Next Transition</h2>
<p><a href="https://twitter.com/kentcdodds">Kent C. Dodds</a></p>
<p>If you haven’t heard of Kent C Dodds, what rock have you been living under? This next bit makes me feel like we’re going back to table design with CSS grid.</p>
<h3 id="progressively-enhanced-single-page-apps-pespa">Progressively Enhanced Single Page Apps (PESPA)</h3>
<p>This is the idea that we progressively enhance our web pages to work with JS.
So where there is no JS… we show a submit button; if JS is available, we use the fetch API.</p>
<p><a href="https://remix.run/">Remix</a> is a web stack that follows this model. I can’t say I am not a fan, even if JS is pretty much ubiquitous.</p>
<p>Find out more from Kent’s <a href="https://www.epicweb.dev/the-webs-next-transition">article</a>.</p>
<h2 id="gotta-go-fast-use-web-components-with-fast">Gotta Go FAST: Use Web Components with FAST</h2>
<p><a href="https://twitter.com/WallerGoble">Waller Goble</a></p>
<p>I love Web Components. It’s a standard that is native to the browser, but like most new browser-based technologies, it is still not fully supported by browsers that are not evergreen (browsers that are automatically upgraded to future versions, rather than being updated by distribution of new versions from the manufacturer). Web Component-based frameworks I am familiar with include <a href="https://lit.dev/docs/v1/lit-html/introduction/">Lit</a> (Google), which is a polyfill to help bridge this gap, and Salesforce <a href="https://developer.salesforce.com/docs/component-library/documentation/en/lwc/lwc.get_started_introduction">Lightning</a>, which does the same thing under the hood. If you are familiar with the Big Commerce platform, it too uses Web Components via <a href="https://stenciljs.com/docs/introduction">Stencil</a> for most of its components. <a href="https://www.fast.design/">FAST</a> is the new kid on the block and a nice surprise. It’s also a surprise that this library has been developed by Microsoft.</p>
<p>Find out more about <a href="https://developer.mozilla.org/en-US/docs/Web/Web_Components#see_also">Web Components on MDN</a> and its web component libraries available.</p>
<p>Having been burnt by ActionScript as a Flash developer, I worry that as Meta and Facebook become less popular platforms, Meta and its community support for React might fade, and I would rather invest in tech that is a browser standard… Like JS, it’s not perfect, but everyone uses it. Having said this, because web components are standard, they can also be integrated with your preferred framework.</p>
<h2 id="what-my-browser-can-do">What my browser can do</h2>
<p><a href="https://twitter.com/paco_ITA">Francesco Leardini</a></p>
<p>The <a href="https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API">Page Visibility API</a> provides events you can watch for to know when a document becomes visible or hidden, as well as features to look at the current visibility state of the page.</p>
<p>Instead of unreliable device sniffing that quickly goes out of date, have a look at the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Media_Capabilities_API">Media Capabilities API</a>.</p>
<p>The <a href="https://developer.mozilla.org/en-US/docs/Web/API/Screen_Wake_Lock_API">WakeLock API</a> prevents device screens from dimming or locking when an application needs to keep running.</p>
<p>This can be combined with the Ambient Light Sensor, which reads the luminance around the host device.</p>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/File_System_Access_API">File system access API</a> only available with HTTPS. I see applications for this in combination with Web Machine Learning. keeping personal information local, secure and decentralised.</p>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/Navigator/share">Native Web share</a> will allow users to share data with other sites without the need for horrible 3rd party APIs.</p>
<p>Access to contacts with the <a href="https://developer.mozilla.org/en-US/docs/Web/API/Contact_Picker_API">Contact Picker API</a>.</p>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_API">WebCodecs API</a> individual frames of a video stream and chunks of audio
Enables hardware encoders/decoders via a WebCodec API. Codecs have always been a pain having to accommodate for different proprietary codecs only available to users of certain OS or software, so this is a nice advancement. Video codecs supported are AV1, AVC1, VP8, VP9 and HEVC.</p>
<p>See Francesco Leardini’s <a href="https://github.com/pacoita/modern-web">GitHub project for examples</a>.</p>
<h2 id="beyond-the-web-of-today">Beyond the web of today</h2>
<p><a href="https://twitter.com/kennethrohde">Kenneth Christiansen</a></p>
<p>Web apps should be able to do anything iOS, Android, or desktop apps can. The members of the cross-company <a href="https://www.chromium.org/teams/web-capabilities-fugu/">Capabilities Project</a> want to make it possible for you to build and deliver apps on the open web that have never been possible before. See the <a href="https://fugu-tracker.web.app/">Fugu API Tracker</a> for current and future APIs in development for the browser.
For fun, check out <a href="https://vscode.dev/">VS Code for the Web</a>.</p>
<p>Some of the features that are worth noting in browser technology:
<a href="https://developer.mozilla.org/en-US/docs/WebAssembly">WebAssembly (Wasm)</a> – WebAssembly is a new type of code that can be run in modern web browsers — it is a low-level assembly-like language with a compact binary format that runs with near-native performance.</p>
<p><a href="https://developer.mozilla.org/en-US/docs/Web/API/WebGL_API">WebGL</a> has been around for a while but is very much underutilised. As browsers are increasingly evergreen, using webGL is becoming more viable. WebGL (Web Graphics Library) is a JavaScript API for rendering high-performance interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins. Kinda makes me angry knowing what Flash was able to do 20 years ago!</p>
<p>While I have been aware of WebGL, I was not aware of <a href="https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Experimental_features">WebGPU</a>, which is the working name for a future web standard and JavaScript API for accelerated graphics and compute, aiming to provide “modern 3D graphics and computation capabilities”.</p>
<p><a href="https://webmachinelearning.github.io/">Web Machine Learning</a>. Arguably the most exciting thing coming to the future of the browser, or is it here now. Web Neural Network brings machine learning to the browser natively. Some interesting uses include machine translation, detecting fake video, facial recognition and emotion analysis. There are a lot of ethical considerations for this technology, and its uses, however, the there are some very good use cases for assistive technology and accessibility for disabilities.</p>
<p>Overall, conferences are good to find out where you would like to get your toes wet and there is no better time than now to start playing around and learning a new skill.</p>
<p>I think that if AI is not outlawed outright, using browser technology to enhance if not completely automate user journeys and interactions is on the cards. How will the web interact with virtual assistants to make holiday bookings and cobots (collaborative robots) to provide medication in our senior years? The future is definitely interesting. Let’s see.</p>
<p><a href="https://capgemini.github.io/frontend/modern_frontends_live_uncut_gems/">Modern Frontends live</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on April 14, 2023.</p>https://capgemini.github.io/development/pact-contract-testing-with-kafka2023-01-06T00:00:00+00:002023-01-06T00:00:00+00:00Papa Anthonyhttps://capgemini.github.io/authors#author-papa-anthony
<h2 id="the-problem">The Problem</h2>
<p>When developing microservices within a distributed system there is a need to ensure that where services communicate with one another, both the providing and consuming services understand what the other expects.</p>
<p>A common solution to this problem is end to end integration testing, where services being tested are deployed into a production-like environment at the same time and real usage scenarios are executed. This allows a relatively high level of confidence that the system and its components work together as expected; however, this method of testing has the following drawbacks:</p>
<ul>
<li>It is a slow process - tests often don’t run in parallel, co-ordination between the teams developing the services can be long winded, teams may be lagging behind others in feature completion so testing can’t be fully representative.</li>
<li>Tests are fragile and are hard to debug - due to so many moving parts, test starters, env configurations, different app versions etc. tests are very brittle and difficult to debug efficiently.</li>
</ul>
<h2 id="other-solutions">Other Solutions</h2>
<p>There are solutions which help to standardise the format of messages being transferred between services, such as OpenAPI specifications or JSON schema specifications. Though useful and important, these solutions do not guarantee that breaking changes are not merged and deployed. This is because responsibility is placed on consumers to keep up to date with different provider versions as they are updated. This inevitably leads to some consumers becoming out of sync for various reasons, leading to message processing errors.</p>
<h2 id="consumer-driven-contract-testing">Consumer Driven Contract Testing</h2>
<p>Consumer driven contract testing is an alternative approach to end to end testing, where the focus is on a single component and its integration boundaries at a time. The responsibility for defining the contract which needs to be adhered to is placed on the consumer. This approach alleviates many of the issues with end to end testing mentioned above:</p>
<ul>
<li>Faster - services don’t need to be deployed and can run locally or in a build pipeline so feedback on breaking changes is much faster.</li>
<li>Simpler, more reliable deployments - removes the need for complicated release coordination and dependencies between teams.</li>
<li>Allows you to know statically at release time which services are compatible.</li>
</ul>
<h2 id="pact-spring-boot-and-kafka">Pact, Spring Boot and Kafka</h2>
<p><a href="https://pact.io">Pact</a> is a popular open source consumer driven contract testing library. It is usually used in the context of testing between APIs and clients. However, pact can also be used to test asynchronous event driven systems. The steps for this are as follows:</p>
<ol>
<li>Test the consumer and capture the contract by using a mock provided by Pact. The mock checks that the consumer can successfully invoke the message handler and process the event.</li>
<li>All the contracts are serialised and loaded into a Pact broker.</li>
<li>Pact pulls all the consumer contracts from the Pact broker, then replays them against the provider. The test verifies the provider can produce the right messages for each consumer by checking that the message structure matches what is defined in the consumer contract.</li>
</ol>
<h2 id="implementation-example">Implementation Example</h2>
<p>In the following example, we will create a simple <a href="https://github.com/PapaAAnthony/kafka-pact">NBA (National Basketball Association) contract themed Spring Boot, Maven, JUnit 5 application</a> which will implement a Kafka consumer that will generate a contract. Following this, we will define a producer and see how, using Pact, we can ensure that the contract between the two services is upheld. We’ll start with ensuring we have the correct <a href="https://github.com/PapaAAnthony/kafka-pact/blob/main/pom.xml">dependencies</a>.</p>
<h3 id="consumer">Consumer</h3>
<p>This consumer will listen on the specified topic for events when a new NBA player signs a contract and then generate a headline that will be logged with specific contract details pulled from the Kafka message.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Component</span>
<span class="nd">@RequiredArgsConstructor</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">PlayerContractListener</span> <span class="o">{</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">Logger</span> <span class="n">logger</span> <span class="o">=</span> <span class="nc">LoggerFactory</span><span class="o">.</span><span class="na">getLogger</span><span class="o">(</span><span class="nc">PlayerContractListener</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">HeadlineGenerator</span> <span class="n">headlineGenerator</span><span class="o">;</span>
<span class="nd">@KafkaListener</span><span class="o">(</span><span class="n">id</span> <span class="o">=</span> <span class="s">"demo"</span><span class="o">,</span> <span class="n">topics</span> <span class="o">=</span> <span class="s">"contract-details"</span><span class="o">)</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">listen</span><span class="o">(</span><span class="nd">@Payload</span> <span class="nc">ContractDetails</span> <span class="n">details</span><span class="o">)</span> <span class="o">{</span>
<span class="n">logger</span><span class="o">.</span><span class="na">info</span><span class="o">(</span><span class="s">"Contract consumed from topic!"</span><span class="o">);</span>
<span class="n">logger</span><span class="o">.</span><span class="na">info</span><span class="o">(</span><span class="n">headlineGenerator</span><span class="o">.</span><span class="na">generateHeadLine</span><span class="o">(</span><span class="n">details</span><span class="o">));</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
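<p>The <code class="language-plaintext highlighter-rouge">HeadlineGenerator</code> that the listener delegates to isn’t listed in this post. For context, here is a minimal sketch of what it might look like - this is a hypothetical implementation rather than the code in the linked repository, and it assumes the Lombok-generated getters on the <code class="language-plaintext highlighter-rouge">ContractDetails</code> class shown later:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.springframework.stereotype.Component;

// Hypothetical sketch - only the class name and method signature come from the listener above.
@Component
public class HeadlineGenerator {

    // Builds a human-readable headline from the fields of the consumed contract event.
    public String generateHeadLine(ContractDetails details) {
        return String.format("%s %s signs a %s deal worth %s with the %s!",
                details.getFirstName(),
                details.getLastName(),
                details.getDuration(),
                details.getSalary(),
                details.getTeam());
    }
}
</code></pre></div></div>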
<p>The pact unit test implementation for this listener is as follows:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@ExtendWith</span><span class="o">(</span><span class="n">value</span> <span class="o">=</span> <span class="o">{</span><span class="nc">PactConsumerTestExt</span><span class="o">.</span><span class="na">class</span><span class="o">,</span> <span class="nc">MockitoExtension</span><span class="o">.</span><span class="na">class</span><span class="o">})</span>
<span class="nd">@PactTestFor</span><span class="o">(</span><span class="n">providerName</span> <span class="o">=</span> <span class="s">"playerContractProducer"</span><span class="o">,</span> <span class="n">providerType</span> <span class="o">=</span> <span class="nc">ProviderType</span><span class="o">.</span><span class="na">ASYNCH</span><span class="o">,</span> <span class="n">pactVersion</span> <span class="o">=</span> <span class="nc">PactSpecVersion</span><span class="o">.</span><span class="na">V3</span><span class="o">)</span>
<span class="kd">class</span> <span class="nc">PlayerContractListenerTest</span> <span class="o">{</span>
<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="nc">String</span> <span class="no">JSON_CONTENT_TYPE</span> <span class="o">=</span> <span class="s">"application/json"</span><span class="o">;</span>
<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="nc">String</span> <span class="no">KEY_CONTENT_TYPE</span> <span class="o">=</span> <span class="s">"contentType"</span><span class="o">;</span>
<span class="nd">@Mock</span>
<span class="kd">private</span> <span class="nc">HeadlineGenerator</span> <span class="n">headlineGenerator</span><span class="o">;</span>
<span class="nd">@InjectMocks</span>
<span class="kd">private</span> <span class="nc">PlayerContractListener</span> <span class="n">playerContractListener</span><span class="o">;</span>
<span class="nd">@Pact</span><span class="o">(</span><span class="n">consumer</span> <span class="o">=</span> <span class="s">"playerContractConsumer"</span><span class="o">)</span>
<span class="nc">MessagePact</span> <span class="nf">contractDetailPact</span><span class="o">(</span><span class="nc">MessagePactBuilder</span> <span class="n">builder</span><span class="o">)</span> <span class="o">{</span>
<span class="nc">PactDslJsonBody</span> <span class="n">jsonBody</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">PactDslJsonBody</span><span class="o">();</span>
<span class="n">jsonBody</span><span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"documentType"</span><span class="o">,</span> <span class="s">"contract"</span><span class="o">)</span>
<span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"firstName"</span><span class="o">,</span> <span class="s">"Lebron"</span><span class="o">)</span>
<span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"lastName"</span><span class="o">,</span> <span class="s">"James"</span><span class="o">)</span>
<span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"team"</span><span class="o">,</span> <span class="s">"LA Lakers"</span><span class="o">)</span>
<span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"duration"</span><span class="o">,</span> <span class="s">"5 years"</span><span class="o">)</span>
<span class="o">.</span><span class="na">stringType</span><span class="o">(</span><span class="s">"salary"</span><span class="o">,</span> <span class="s">"158 million USD"</span><span class="o">);</span>
<span class="k">return</span> <span class="n">builder</span><span class="o">.</span><span class="na">expectsToReceive</span><span class="o">(</span><span class="s">"A player contract"</span><span class="o">)</span>
<span class="o">.</span><span class="na">withMetadata</span><span class="o">(</span><span class="nc">Map</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="no">JSON_CONTENT_TYPE</span><span class="o">,</span> <span class="no">KEY_CONTENT_TYPE</span><span class="o">))</span>
<span class="o">.</span><span class="na">withContent</span><span class="o">(</span><span class="n">jsonBody</span><span class="o">)</span>
<span class="o">.</span><span class="na">toPact</span><span class="o">();</span>
<span class="o">}</span>
<span class="nd">@Test</span>
<span class="nd">@PactTestFor</span><span class="o">(</span><span class="n">pactMethod</span> <span class="o">=</span> <span class="s">"contractDetailPact"</span><span class="o">,</span> <span class="n">providerType</span> <span class="o">=</span> <span class="nc">ProviderType</span><span class="o">.</span><span class="na">ASYNCH</span><span class="o">)</span>
<span class="kt">void</span> <span class="nf">successfullyGenerateHeadlineGivenValidMessage</span><span class="o">(</span><span class="nc">List</span><span class="o"><</span><span class="nc">Message</span><span class="o">></span> <span class="n">messages</span><span class="o">)</span> <span class="o">{</span>
<span class="nc">ContractDetails</span> <span class="n">contractDetails</span> <span class="o">=</span> <span class="nc">ContractDetails</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
<span class="o">.</span><span class="na">documentType</span><span class="o">(</span><span class="s">"contract"</span><span class="o">)</span>
<span class="o">.</span><span class="na">firstName</span><span class="o">(</span><span class="s">"Lebron"</span><span class="o">)</span>
<span class="o">.</span><span class="na">lastName</span><span class="o">(</span><span class="s">"James"</span><span class="o">)</span>
<span class="o">.</span><span class="na">team</span><span class="o">(</span><span class="s">"LA Lakers"</span><span class="o">)</span>
<span class="o">.</span><span class="na">duration</span><span class="o">(</span><span class="s">"5 years"</span><span class="o">)</span>
<span class="o">.</span><span class="na">salary</span><span class="o">(</span><span class="s">"158 million USD"</span><span class="o">)</span>
<span class="o">.</span><span class="na">build</span><span class="o">();</span>
<span class="n">when</span><span class="o">(</span><span class="n">headlineGenerator</span><span class="o">.</span><span class="na">generateHeadLine</span><span class="o">(</span><span class="n">contractDetails</span><span class="o">)).</span><span class="na">thenReturn</span><span class="o">(</span><span class="s">"A new headline"</span><span class="o">);</span>
<span class="n">messages</span><span class="o">.</span><span class="na">forEach</span><span class="o">(</span><span class="n">message</span> <span class="o">-></span> <span class="o">{</span>
<span class="n">assertDoesNotThrow</span><span class="o">(()</span> <span class="o">-></span> <span class="n">playerContractListener</span><span class="o">.</span><span class="na">listen</span><span class="o">(</span>
<span class="k">new</span> <span class="nf">ObjectMapper</span><span class="o">().</span><span class="na">readValue</span><span class="o">(</span><span class="n">message</span><span class="o">.</span><span class="na">contentsAsBytes</span><span class="o">(),</span> <span class="nc">ContractDetails</span><span class="o">.</span><span class="na">class</span><span class="o">)));</span>
<span class="n">verify</span><span class="o">(</span><span class="n">headlineGenerator</span><span class="o">,</span> <span class="n">times</span><span class="o">(</span><span class="mi">1</span><span class="o">)).</span><span class="na">generateHeadLine</span><span class="o">(</span><span class="n">contractDetails</span><span class="o">);</span>
<span class="o">});</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">@ExtendWith</code> allows us to specify both the <code class="language-plaintext highlighter-rouge">PactConsumerTestExt</code> and the <code class="language-plaintext highlighter-rouge">MockitoExtention</code> to initialise our Mockito/Pact annotations.</p>
<p><code class="language-plaintext highlighter-rouge">@PactTestFor</code> at the class level allows us to specify the providerName, this value is important as it will need to match the name used when we build the provider tests. <code class="language-plaintext highlighter-rouge">providerType</code> indicates that this is a test for an asynchronous system and the <code class="language-plaintext highlighter-rouge">pactVersion</code> allows us to declare the Pact version (V3 in this case).</p>
<p><code class="language-plaintext highlighter-rouge">@Pact</code> is where we specify the name of our consumer, again it is important to ensure this matches the name that is given to the provider side of the Pact test. In the <code class="language-plaintext highlighter-rouge">pact</code> method itself we are able to use the <code class="language-plaintext highlighter-rouge">PactDslJsonBody</code> to define the structure of our contract.</p>
<p><code class="language-plaintext highlighter-rouge">@PactTestFor</code> on our test method is where we tell Pact that the <code class="language-plaintext highlighter-rouge">contractDetailPact</code> method will provide the messages we want to test against our consumer method to ensure that it is able to process the message structure as expected. In this example we are using a default <code class="language-plaintext highlighter-rouge">ByteArrayDeserializer</code> from the Apache Kafka library for message deserialisation. For brevity we are using an object mapper to mimic the deserialisation of the message from bytes. If you are using a custom deserialiser you can use that code to deserialise the message to ensure that your deserialiser can also handle the structure of the Pact message defined.</p>
<p>Once the test is run and the message has been successfully processed by our consumer, a Pact contract is generated and stored in our target/pacts directory by default. Once this is complete we can use the following Maven command, <code class="language-plaintext highlighter-rouge">mvn pact:publish</code>, to publish our contract to our Pact broker, where it will be verified against our producer to ensure that the messages it produces are what we expect.</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"consumer"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"playerContractConsumer"</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"messages"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"contents"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"documentType"</span><span class="p">:</span><span class="w"> </span><span class="s2">"contract"</span><span class="p">,</span><span class="w">
</span><span class="nl">"duration"</span><span class="p">:</span><span class="w"> </span><span class="s2">"5 years"</span><span class="p">,</span><span class="w">
</span><span class="nl">"firstName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Lebron"</span><span class="p">,</span><span class="w">
</span><span class="nl">"lastName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"James"</span><span class="p">,</span><span class="w">
</span><span class="nl">"salary"</span><span class="p">:</span><span class="w"> </span><span class="s2">"158 million USD"</span><span class="p">,</span><span class="w">
</span><span class="nl">"team"</span><span class="p">:</span><span class="w"> </span><span class="s2">"LA Lakers"</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">"A player contract"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchingRules"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"body"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"$.documentType"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"$.duration"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"$.firstName"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"$.lastName"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"$.salary"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"$.team"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"combine"</span><span class="p">:</span><span class="w"> </span><span class="s2">"AND"</span><span class="p">,</span><span class="w">
</span><span class="nl">"matchers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"match"</span><span class="p">:</span><span class="w"> </span><span class="s2">"type"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"metaData"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"contentType"</span><span class="p">:</span><span class="w"> </span><span class="s2">"application/json"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nl">"metadata"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"pact-jvm"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"4.3.13"</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"pactSpecification"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"3.0.0"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="nl">"provider"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"playerContractProducer"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<h3 id="provider">Provider</h3>
<p>The producer contains a simple REST endpoint that takes a new player contract as a request body.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@RestController</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">PlayerContractController</span> <span class="o">{</span>
<span class="nd">@Autowired</span>
<span class="kd">private</span> <span class="nc">PlayerContractProducer</span> <span class="n">playerContractProducer</span><span class="o">;</span>
<span class="nd">@PostMapping</span><span class="o">(</span><span class="s">"/sign"</span><span class="o">)</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">createDraftContract</span><span class="o">(</span><span class="nd">@RequestBody</span> <span class="nc">PlayerContract</span> <span class="n">contract</span><span class="o">)</span> <span class="o">{</span>
<span class="n">playerContractProducer</span><span class="o">.</span><span class="na">send</span><span class="o">(</span><span class="n">contract</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">playerContract</code> object is mapped into a <code class="language-plaintext highlighter-rouge">ContractDetails</code> object and sent to the specified topic using a default Kafka template.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Component</span>
<span class="nd">@RequiredArgsConstructor</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">PlayerContractProducer</span> <span class="o">{</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">KafkaTemplate</span><span class="o"><</span><span class="nc">String</span><span class="o">,</span> <span class="nc">ContractDetails</span><span class="o">></span> <span class="n">template</span><span class="o">;</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">PlayerContractMapper</span> <span class="n">contractMapper</span><span class="o">;</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">Logger</span> <span class="n">logger</span> <span class="o">=</span> <span class="nc">LoggerFactory</span><span class="o">.</span><span class="na">getLogger</span><span class="o">(</span><span class="nc">PlayerContractProducer</span><span class="o">.</span><span class="na">class</span><span class="o">);</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">sendContractDetails</span><span class="o">(</span><span class="nc">PlayerContract</span> <span class="n">playerContract</span><span class="o">)</span> <span class="o">{</span>
<span class="n">template</span><span class="o">.</span><span class="na">send</span><span class="o">(</span><span class="s">"contract-details"</span><span class="o">,</span> <span class="n">contractMapper</span><span class="o">.</span><span class="na">mapContractDetails</span><span class="o">(</span><span class="n">playerContract</span><span class="o">));</span>
<span class="n">logger</span><span class="o">.</span><span class="na">info</span><span class="o">(</span><span class="s">"Contract produced to topic!"</span><span class="o">);</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
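<p>The <code class="language-plaintext highlighter-rouge">PlayerContractMapper</code> used above is not listed in this post. A minimal sketch of what such a mapper might look like is shown below - this is an assumption rather than the repository’s actual code, and the getters on <code class="language-plaintext highlighter-rouge">PlayerContract</code> are assumed to be generated by Lombok. Note that it only copies the fields that the initial <code class="language-plaintext highlighter-rouge">ContractDetails</code> class (shown further down) exposes, which is exactly what triggers the Pact verification failure described later:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.springframework.stereotype.Component;

// Hypothetical sketch of the mapper referenced above.
@Component
public class PlayerContractMapper {

    // Copies the fields the downstream consumer cares about from the incoming
    // PlayerContract onto the ContractDetails event published to Kafka.
    public ContractDetails mapContractDetails(PlayerContract playerContract) {
        return ContractDetails.builder()
                .documentType(playerContract.getDocumentType())
                .firstName(playerContract.getFirstName())
                .lastName(playerContract.getLastName())
                .duration(playerContract.getDuration())
                .build();
    }
}
</code></pre></div></div>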
<p>The test for the producer is implemented as below:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Provider</span><span class="o">(</span><span class="s">"playerContractProducer"</span><span class="o">)</span>
<span class="nd">@Consumer</span><span class="o">(</span><span class="s">"playerContractConsumer"</span><span class="o">)</span>
<span class="nd">@PactBroker</span><span class="o">(</span><span class="n">url</span> <span class="o">=</span> <span class="s">"http://localhost:9292"</span><span class="o">)</span>
<span class="kd">class</span> <span class="nc">PlayerContractMapperTest</span> <span class="o">{</span>
<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="nc">String</span> <span class="no">JSON_CONTENT_TYPE</span> <span class="o">=</span> <span class="s">"application/json"</span><span class="o">;</span>
<span class="kd">private</span> <span class="kd">static</span> <span class="kd">final</span> <span class="nc">String</span> <span class="no">KEY_CONTENT_TYPE</span> <span class="o">=</span> <span class="s">"contentType"</span><span class="o">;</span>
<span class="kd">private</span> <span class="kd">final</span> <span class="nc">PlayerContractMapper</span> <span class="n">contractMapper</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">PlayerContractMapper</span><span class="o">();</span>
<span class="nd">@BeforeEach</span>
<span class="kt">void</span> <span class="nf">before</span><span class="o">(</span><span class="nc">PactVerificationContext</span> <span class="n">context</span><span class="o">)</span> <span class="o">{</span>
<span class="n">context</span><span class="o">.</span><span class="na">setTarget</span><span class="o">(</span><span class="k">new</span> <span class="nc">MessageTestTarget</span><span class="o">());</span>
<span class="o">}</span>
<span class="nd">@TestTemplate</span>
<span class="nd">@ExtendWith</span><span class="o">(</span><span class="nc">PactVerificationInvocationContextProvider</span><span class="o">.</span><span class="na">class</span><span class="o">)</span>
<span class="kt">void</span> <span class="nf">pactVerificationTestTemplate</span><span class="o">(</span><span class="nc">PactVerificationContext</span> <span class="n">context</span><span class="o">)</span> <span class="o">{</span>
<span class="n">context</span><span class="o">.</span><span class="na">verifyInteraction</span><span class="o">();</span>
<span class="o">}</span>
<span class="nd">@PactVerifyProvider</span><span class="o">(</span><span class="s">"A player contract"</span><span class="o">)</span>
<span class="nc">MessageAndMetadata</span> <span class="nf">verifyMessage</span><span class="o">()</span> <span class="o">{</span>
<span class="nc">PlayerContract</span> <span class="n">playerContract</span> <span class="o">=</span> <span class="nc">PlayerContract</span><span class="o">.</span><span class="na">builder</span><span class="o">()</span>
<span class="o">.</span><span class="na">age</span><span class="o">(</span><span class="mi">37</span><span class="o">)</span>
<span class="o">.</span><span class="na">dateSigned</span><span class="o">(</span><span class="nc">LocalDate</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="mi">2022</span><span class="o">,</span> <span class="mi">4</span><span class="o">,</span> <span class="mi">3</span><span class="o">))</span>
<span class="o">.</span><span class="na">documentType</span><span class="o">(</span><span class="s">"contract"</span><span class="o">)</span>
<span class="o">.</span><span class="na">firstName</span><span class="o">(</span><span class="s">"Lebron"</span><span class="o">)</span>
<span class="o">.</span><span class="na">lastName</span><span class="o">(</span><span class="s">"James"</span><span class="o">)</span>
<span class="o">.</span><span class="na">team</span><span class="o">(</span><span class="s">"LA Lakers"</span><span class="o">)</span>
<span class="o">.</span><span class="na">position</span><span class="o">(</span><span class="s">"Power Forward"</span><span class="o">)</span>
<span class="o">.</span><span class="na">duration</span><span class="o">(</span><span class="s">"5 years"</span><span class="o">)</span>
<span class="o">.</span><span class="na">salary</span><span class="o">(</span><span class="s">"158 million USD"</span><span class="o">)</span>
<span class="o">.</span><span class="na">build</span><span class="o">();</span>
<span class="nc">JsonSerializer</span><span class="o"><</span><span class="nc">ContractDetails</span><span class="o">></span> <span class="n">serializer</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">JsonSerializer</span><span class="o"><>();</span>
<span class="k">return</span> <span class="k">new</span> <span class="nf">MessageAndMetadata</span><span class="o">(</span><span class="n">serializer</span><span class="o">.</span><span class="na">serialize</span><span class="o">(</span><span class="s">"kafka-pact"</span><span class="o">,</span> <span class="n">contractMapper</span><span class="o">.</span><span class="na">mapContractDetails</span><span class="o">(</span><span class="n">playerContract</span><span class="o">)),</span>
<span class="nc">Map</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="no">KEY_CONTENT_TYPE</span><span class="o">,</span> <span class="no">JSON_CONTENT_TYPE</span><span class="o">));</span>
<span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">@Provider</code> lets us tell Pact the name of our provider which should match whatever was specified in the consumer test.</p>
<p><code class="language-plaintext highlighter-rouge">@Consumer</code> lets us tell Pact the name of the specified consumer we are testing against, again it must match what we specified in the consumer test.</p>
<p><code class="language-plaintext highlighter-rouge">@PactBroker</code> is where we specify the url of our Pact broker where our consumer contract is stored.</p>
<p>Since it is actually the responsibility of the <code class="language-plaintext highlighter-rouge">ContractMapper</code> within our project to ensure that the message is in the correct format, that is the class that we will unit test using Pact. The result of the <code class="language-plaintext highlighter-rouge">mapContractDetails</code> method call is then serialised and verified against the contract that was generated and published to the broker by the consumer.</p>
<p>In this instance it seems like we have missed off two important fields that are needed by our consumer - <code class="language-plaintext highlighter-rouge">salary</code> and <code class="language-plaintext highlighter-rouge">team</code>:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Getter</span>
<span class="nd">@Builder</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">ContractDetails</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">documentType</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">firstName</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">lastName</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">duration</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>
<p>As a result the test run failed with the following error:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1<span class="o">)</span> A player contract: generates a message which has a matching body
1.1<span class="o">)</span> body: <span class="nv">$ </span>Actual map is missing the following keys: salary, team
<span class="o">{</span>
<span class="s2">"documentType"</span>: <span class="s2">"contract"</span>,
<span class="s2">"duration"</span>: <span class="s2">"5 years"</span>,
<span class="s2">"firstName"</span>: <span class="s2">"Lebron"</span>,
- <span class="s2">"lastName"</span>: <span class="s2">"James"</span>,
- <span class="s2">"salary"</span>: <span class="s2">"158 million USD"</span>,
- <span class="s2">"team"</span>: <span class="s2">"LA Lakers"</span>
+ <span class="s2">"lastName"</span>: <span class="s2">"James"</span>
<span class="o">}</span>
</code></pre></div></div>
<p>The result of the test failure has now been published to the pact broker:</p>
<p><img src="/images/2022-09-05-pact-contract-testing-with-kafka/pact-test-failure.png" /></p>
<p>Once we update our code with the missing fields and retest:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">@Getter</span>
<span class="nd">@Builder</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">ContractDetails</span> <span class="o">{</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">documentType</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">firstName</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">lastName</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">team</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">duration</span><span class="o">;</span>
<span class="kd">private</span> <span class="nc">String</span> <span class="n">salary</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>
<p>We now have a passing build, giving us confidence to push our producer code knowing that it does not contain any breaking changes for our consumer:</p>
<p><img src="/images/2022-09-05-pact-contract-testing-with-kafka/pact-test-pass.png" /></p>
<p>All this was done locally without having to deploy both our consumer and producer into an environment, saving us from a lot of wasted time and effort.</p>
<p><a href="https://capgemini.github.io/development/pact-contract-testing-with-kafka/">Consumer Driven Contract Testing with Pact, Kafka and Spring Boot</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on January 06, 2023.</p>https://capgemini.github.io/devsecops/platform-for-product-oriented-teams2022-12-02T00:00:00+00:002022-12-02T00:00:00+00:00Sarah Saundershttps://capgemini.github.io/authors#author-sarah-saunders
<p><a href="https://www.gartner.com/smarterwithgartner/cio-agenda-2019-move-from-project-to-product-delivery">Gartner</a> told us that by 2022, 80% of us would have moved to a more <a href="https://www.infosys.com/iki/perspectives/product-centric-value-delivery.html">product-centric IT operating model</a>. What does this mean, and more specifically, what does it mean for us as software engineering consultants?</p>
<h2 id="products---delivering-value">Products - Delivering Value</h2>
<p>First, some definitions. What do we mean by “Product”? For me, this is a very business-oriented term. Say you’re a dairy farm: your products might be milk,
cheese, ice-cream. This maps to the Agile definition of “Product” as a vehicle to <a href="https://scrumguides.org/scrum-guide.html#product-backlog">deliver value</a>;
as in “Product Owner” - perhaps the cheesemaker in the dairy - the person who understands how applications impact their business and is ultimately responsible
for deciding which software changes are built and released by a team. And the important concepts behind product-oriented teams, for us software engineers,
are twofold.</p>
<p>First, there is the concept of funding. For a business to fund a product rather than a project implies no end date to the funding stream, which fits in much better with agile practices (for example, we only plan in detail for the next sprint) and supports the creation of long-lived teams which can become much more efficient at delivery.</p>
<p>Second, there is the concept of having full focus on delivering business value. This isn’t new, and is probably the goal for all sprint teams, but anyone who’s worked in an agile
team will have hit that sprint where they have to deliver tech debt - perhaps they have to patch some libraries, or restructure their data schemas to improve application performance, or
do something in the DevOps space such as implement blue/green releases. The product owner is on board with the need to do these things, but isn’t really interested in the implementation and just wants these problems to “go away” so they can get back to more important sprint stories. Hence the emphasis on product teams focusing on value delivery and customer satisfaction.</p>
<h3 id="unicorns">Unicorns</h3>
<p>You’ve probably spotted the flaw in this idea - how can the IT teams focus solely on “value delivery” and STILL provide something performant and secure?
How does all the other stuff get done? What about creating delivery pipelines and test frameworks and patching strategies and scaling mechanisms? What about
all the DevOps tasks such as making our app supportable and observable? This HAS to be done, but it isn’t accounted for in the job definition of product-centric teams.</p>
<h2 id="zones-of-repeatability">Zones of Repeatability</h2>
<p>The good news about all the important DevOps style work which our product-centric teams don’t have time to do is that whilst it’s heavyweight work, most of it is
highly repeatable. This means that solutions can be shared between product teams.</p>
<h3 id="platforms---the-new-platforms">Platforms - the new Platforms!</h3>
<p>In other words, what we need is a “common platform”. This isn’t a new idea, and exists in some
form or other in most companies - complete with a Platform Team in charge of creating and maintaining it. These platforms, however, are often destined for
failure - and by failure, I mean they do not get used and do not provide the advantages that they promised. This happens for a number of common reasons that we would want to avoid when defining our platform to support product-centric teams. I’ll list some of the major pitfalls here.</p>
<h4 id="1-the-gap">1. The Gap</h4>
<p>You might be familiar with the idea of “Goldilocks problems” - in the story of Goldilocks and the three bears, she always tries the two extremes before she gets
it right. The first porridge is far too hot, the second is far too cold. Similar things tend to happen to platform teams when they are given the remit
to create a common developer platform. At one extreme, they create a platform which doesn’t do enough for the developer and, as such, isn’t an accelerator
and doesn’t save them any time. An example of this might be when a company first decides to build a shared platform and hires some platform engineers with a
very loose remit. The engineers think of some tools the devs might like, and provision them somewhere, and create some roles and access rights, but without
an understanding of what the developers do (or want to do) this isn’t going to be usable.</p>
<h4 id="2-should-vs-could">2. Should vs Could</h4>
<p>Fellow blogger <a href="https://capgemini.github.io/authors/#author-chris-burns">Chris Burns</a> came up with this title. The concept comes from the speech
by Dr. Ian Malcolm in Jurassic Park -
“your scientists were so preoccupied with whether or not they <em>could</em>, that they didn’t stop to think if they <em>should</em>”. This might be what happens when you
throw money and resources at a platform problem - you hire the best DevOps engineers, they have incredible skills with tools like Jenkins - they can script
their way around a whole bunch of issues and failures, but again, they’re not focussed on the right problem. Their remit is still too loose and they haven’t
got an eye on who their customer is, and because customer satisfaction isn’t set as top of their priorities, they’ll create a monster that is frankly unusable
by the dev team and quite possibly also a security hazard. We’ve seen examples of this, such as a developer requesting a job to tear down an environment. Perhaps they have a bad test which relies on certain data in the database - the right solution is to rewrite the test. The platform team’s solution might instead be to create some heinous post-cleanup script to reintroduce the required data before the test runs. Yes, you can do that - but it’s not the right answer!</p>
<h4 id="3-devs-in-chains">3. Devs in Chains</h4>
<p>This pitfall is at the reverse end of the “Goldilocks Problem” - rather than having a platform that is too loosely defined, you create one that is too
locked down. It might be very efficient at what it does, but if it doesn’t do what the developers need it’s still a failure because they won’t use it. For
example, perhaps you provide build pipelines without the ability to edit what they do. They may deploy containers very efficiently; but what if you suddenly
want to add a serverless function to your architecture? What if you want to run a different set of tests? The platform should not dictate what the dev teams can do to this level.
Such restrictive platforms could also result in
security issues. Developers are an extremely creative bunch: if you try to lock down access to the databases, for example, some bright spark might realise that if they simply add an ‘apt-get install postgresql-client’ into the Dockerfile of the Java applications deployed into their Kubernetes cluster, they could then exec into the pod when it was deployed and use the PostgreSQL client installed into the container as a “back door” to access the database. Argh!</p>
<h2 id="a-new-mindset">A New Mindset</h2>
<p>So what’s the solution? We as developers know it very well. When we’re building applications for our product owners, we work in an agile way - we create
friendly “user stories” and break them down into tasks that can be delivered quickly, we build a minimum viable product and get it out for some user feedback
and then we iteratively improve from there. So why on earth don’t we do that when we build platforms?! Why do we hire a bunch of highly talented platform
engineers and hide them away behind a service desk interface, creating the “Dev/Ops gap” we have been trying so hard to break down all these years?
All that we need is for the platform team to reverse its mindset. To remember that the developer is the <em>customer</em> in this scenario, and in the same way
as IT customers the whole world over, they do not really know what they want.</p>
<h3 id="needs-versus-requirements">Needs versus Requirements</h3>
<p>If a platform engineer adopts this customer-facing mindset and sits down with a developer to list out what they want, the developer might say “I want access
to the production servers and all the databases”. Our engineer needs to be ready here to translate this into what they actually <em>need</em>, which is the ability
to deploy applications, to observe application behaviour, and to make changes when necessary. This isn’t quite how the dev has phrased it! So we need to
bring to the table all the skills from agile methodologies and also from practices such as <a href="https://en.wikipedia.org/wiki/Domain-driven_design">domain-driven design</a>
to make sure we are getting our customer requirements right.</p>
<h3 id="build-an-mvp">Build an MVP</h3>
<p>When you look at the <a href="https://landscape.cncf.io/">CNCF Cloud Native landscape</a> it’s easy to become overwhelmed as to how you are going to build a platform
which ticks all of these boxes. Of course, the way to break down this complexity is to start with a Minimum Viable <strong>Platform</strong>, the same way we would build
an MVP for a complex application. Figure out what you need first, figure out where the risk is, get those bits in place and working and iteratively improve
from there.</p>
<h3 id="the-paved-road">The Paved Road</h3>
<p>The secret to not being too restrictive is to follow <a href="https://www.infoq.com/news/2017/06/paved-paas-netflix/">Netflix’s example</a> of building a “paved road” across the CNCF landscape without locking dev
teams down to a certain path. For example, we know we will need some kind of pipeline automation software to run builds and other deployment-related jobs.
But which one to use? It doesn’t really matter - just make sure that your platform is built in a sufficiently layered, pluggable way and then put in a
suggested tool - say, Tekton - and if there is one team who REALLY want to use Concourse for some reason, they are welcome to configure the platform
and change the pipeline tool. The “paved road” is there as an accelerator for teams who don’t have an opinion on which pipeline to use, but they are not
forced to use this route.</p>
<h2 id="our-opinionated-stack">Our Opinionated Stack</h2>
<p><img src="/images/2022-10-07-platform-for-product-oriented-teams/StackScope.jpg" alt=""CREATE opinionated stack scope"" /></p>
<p>The Cloud Development team at Capgemini have created our own “Paved Road” through the CNCF landscape. We’ve called it CREATE - the Cloud Ready Environment
for Application Test and Execution. It takes the principles of zero trust, customer centricity, automation first and separation of concerns. It will be open-sourced
and uses mainly open source components. We’ve assumed that the cloud platform will be Kubernetes-as-a-service, as this is a good abstraction from
specific cloud vendors whilst allowing scale-to-zero for maximum compute efficiency. We have defined the necessary pods for both tooling and deployment
using Terraform and Helm. We’ve separated out Continuous Integration from Continuous Delivery, with separate pipelines and separate permissions for each workflow,
and we add <a href="https://goharbor.io/">Harbor</a> as a place where built artifacts can be interrogated, signed and stored securely.</p>
<p>We’ve used the wonderful <a href="https://backstage.io/">Backstage portal</a> from Spotify to build our developer interface - it’s
so much more friendly and intuitive than trying a “square peg/ round hole” approach of building a portal out of service desk or issue tracking software.</p>
<p><img src="/images/2022-10-07-platform-for-product-oriented-teams/JIRAvBackstage.jpg" alt=""JIRA vs Backstage"" /></p>
<p>The remit of CREATE is to provide a catalogue of template applications - a React app, for example, and a Spring Boot app - from which the developer can
choose what best fits their need. It will then manage the whole <a href="https://about.gitlab.com/topics/gitops/">GitOps workflow</a> - creating the cloud infrastructure for running the build (pipelines, quality analysis tools etc.) as well as the infrastructure for hosting the application.
It provides a template to deploy useful peripherals such as a credentials vault, authentication,
monitoring/logging/tracing tools. We don’t mind which ones - we provide a default, but they’re pluggable; we’re more attached to the principles than the tooling choice itself.
CREATE will then set up a repository for the template application, build and deploy it, and leave the build pipeline waiting for changes. The deployed application will automatically scale based on load. And all this in under an hour. Yes, it has its limitations - it’s designed for containerised applications hosted on Kubernetes - but anyone who’s worked on applications in this space will
appreciate just how much time and effort it will save.</p>
<p>If you’d like to find out more about CREATE, and receive notification of when we release our MVP, please get in touch with
<a href="https://www.linkedin.com/in/sarahsaunders1/">myself</a> or <a href="https://www.linkedin.com/in/chris-j-burns/">Chris Burns</a> via Linkedin.</p>
<p><a href="https://capgemini.github.io/devsecops/platform-for-product-oriented-teams/">Platforms to support Product Oriented Teams</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on December 02, 2022.</p>https://capgemini.github.io/development/reference-panel-in-dynamics-3652022-07-15T00:00:00+01:002022-07-15T00:00:00+01:00Venkata Ravi Babu Vakalapudihttps://capgemini.github.io/authors#author-venkata-ravi-babu-vakalapudi
<h2 id="introduction">Introduction</h2>
<p>The reference panel is a valuable feature in Dynamics 365. Subgrids generally occupy a good amount of space in a form. When we need to add multiple subgrids to the same form, it not only impacts performance but also makes the UI look cluttered for users. The reference panel overcomes this issue.</p>
<h2 id="what-is-the-reference-panel">What is the reference panel?</h2>
<p>As per <a href="https://docs.microsoft.com/en-us/dynamics365/customerengagement/on-premises/customize/section-properties-legacy?view=op-9-1">MSDN</a>, a reference panel is a single-column section. You can insert subgrids, a quick view control, or a knowledge base search control inside a reference panel section. Each control that you add to the reference panel appears as a vertical tab within the panel at runtime. You can drag and drop the various controls within the reference panel section. The default tab at runtime is the first control added to the reference panel. The other tabs appear in the order in which they are added in the form editor. To delete a tab, use the delete key on your keyboard.</p>
<p>When you insert a reference panel, by default, it’s added as the last section in the tab. You can add only one reference panel per form.</p>
<h2 id="steps-to-create">Steps to create</h2>
<ol>
<li>As a system administrator, add your entity (account in this scenario) to a solution.</li>
<li>Open the entity form on which you want to insert the reference panel.</li>
<li>Go to Insert -> Section -> Reference Panel
<img src="/images/2022-04-14_ReferencePanel_min.png" alt="Reference Panel." class="centered medium-8" /></li>
<li>Once you click on the reference panel, a section will be created as the last section in the tab.
<img src="/images/2022-04-14_RP_NewSection_min.png" alt="Reference Panel new section." class="centered medium-8" /></li>
<li>Then insert multiple subgrids in this section, as per your requirements.
<img src="/images/2022-04-14_RP_AddingSubgrid_min.png" alt="Add subgrids to the section." class="centered medium-8" /></li>
<li>Then save and publish the form; you will see a subgrid with multiple buttons attached, as shown below. Clicking each button dynamically changes the view shown in the subgrid.
<img src="/images/2022-04-14_RP_Result_min.png" alt="Final Result." class="centered medium-8" /></li>
</ol>
<p>I hope you learned something new today.</p>
<p>Happy Learning.</p>
<p><a href="https://capgemini.github.io/development/reference-panel-in-dynamics-365/">Reference panel in Dynamics 365</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on July 15, 2022.</p>https://capgemini.github.io/development/contributing-to-dcx-react-library2022-05-27T00:00:00+01:002022-05-27T00:00:00+01:00Isaac Babalolahttps://capgemini.github.io/authors#author-isaac-babalola
<h2 id="introduction">Introduction</h2>
<p>At Capgemini, within the DCX (Digital Customer Experience) team, we have built and released the first style-agnostic React component library, which provides consumers with a suite of tested React components that can be re-used within any React front end, thereby speeding up the process of beginning a new project.</p>
<p>In a previous <a href="https://capgemini.github.io/development/dcx-react-library/">blog post</a> we introduced the DCX React component library and in this blog post I will be explaining the process by which we created the library and how you can contribute to the ever-growing list of React components.</p>
<p>At the time of writing this post, version 0.4 of the library has been released to the public <a href="https://www.npmjs.com/package/@capgeminiuk/dcx-react-library">npm registry</a>, and the full suite of currently available React components can be viewed within the <a href="https://6069a6f47f4b9f002171f8e1-bqlntwzjjl.chromatic.com/?path=/story/dcxlibrary-introduction--page">storybook</a>.</p>
<h2 id="assumption">Assumption</h2>
<p>As I will be explaining the process by which we created a React component library, it is assumed that you are familiar with JavaScript, React and TypeScript.</p>
<h2 id="stage-1-creating-the-library">Stage 1: Creating the library</h2>
<p>Based on the growing popularity of <a href="https://yarnpkg.com/">yarn</a> over the last 5 years and its performance benefits over the <a href="https://www.npmjs.com/package/npm">npm</a> package manager, we decided to use it as our package manager.</p>
<p>During our initial research for tools to help build the component library we noticed that the React ecosystem for building web applications was quite saturated with tools like <a href="https://create-react-app.dev/">Create React App (CRA)</a>, <a href="https://nextjs.org/">Next.js</a> and <a href="https://remix.run/">Remix</a> but the options for building a React component library were limited.</p>
<p>To create the library, we opted to use <a href="https://github.com/developit/microbundle">microbundle</a>, which is a “zero-configuration bundler for tiny modules”.</p>
<p>Microbundle is a wrapper around <a href="https://rollupjs.org/guide/en/">rollup</a> with predefined defaults such as minification and compression; it produces nicely formatted stats and multiple target formats (ES modules, CommonJS and UMD).</p>
<p>The bundle sizes are small because we made a conscious effort to avoid importing external libraries, thereby minimising our dependency on external code. However, in some cases, to avoid reinventing the wheel, we selectively imported from libraries like <a href="https://www.npmjs.com/package/lodash">lodash</a> that have a small number of external dependencies, to reduce our exposure to vulnerabilities.</p>
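<p>As a minimal sketch of that approach (the <code class="language-plaintext highlighter-rouge">debounce</code> function here is purely illustrative, not necessarily one the library actually uses), importing a single lodash module rather than the whole package keeps the bundled output small:</p>
<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Illustrative sketch only: import just the module we need from lodash
// so the bundler doesn't pull the whole library into the output bundle.
import debounce from 'lodash/debounce';

// debounce the handler so rapid calls collapse into a single invocation
export const debouncedLog = debounce((value: string) => {
  console.log(value);
}, 300);
</code></pre></div></div>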
<p>As you can see below, the library’s bundles are small and available in multiple targets:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 10 kB: dcx-react-library.js.gz
8.9 kB: dcx-react-library.js.br
9.34 kB: dcx-react-library.modern.js.gz
8.36 kB: dcx-react-library.modern.js.br
9.92 kB: dcx-react-library.module.js.gz
8.86 kB: dcx-react-library.module.js.br
10.1 kB: dcx-react-library.umd.js.gz
8.97 kB: dcx-react-library.umd.js.br
</code></pre></div></div>
<p>More importantly, because of the benefits of static typing, we were mostly interested in the “out of the box” support it has for <a href="https://www.typescriptlang.org/">TypeScript</a>.</p>
<p>To configure microbundle we added the following properties to the project’s <code class="language-plaintext highlighter-rouge">package.json</code> file, specifying where the input files come from, where the output bundles are placed, and where the generated TypeScript type definitions live.</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"source"</span><span class="p">:</span><span class="w"> </span><span class="s2">"src/index.ts"</span><span class="p">,</span><span class="w">
</span><span class="nl">"main"</span><span class="p">:</span><span class="w"> </span><span class="s2">"dist/dcx-react-library.js"</span><span class="p">,</span><span class="w">
</span><span class="nl">"module"</span><span class="p">:</span><span class="w"> </span><span class="s2">"dist/dcx-react-library.module.js"</span><span class="p">,</span><span class="w">
</span><span class="nl">"unpkg"</span><span class="p">:</span><span class="w"> </span><span class="s2">"dist/dcx-react-library.umd.js"</span><span class="p">,</span><span class="w">
</span><span class="nl">"typings"</span><span class="p">:</span><span class="w"> </span><span class="s2">"dist/index.d.ts"</span><span class="p">,</span><span class="w">
</span><span class="nl">"files"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"dist"</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<h2 id="stage-2-setting-up-the-library">Stage 2: Setting up the library</h2>
<p>To ensure that we build the library to a high standard there were several tools and processes we put in place to aid our efforts.</p>
<p>Firstly, we added a <code class="language-plaintext highlighter-rouge">.gitignore</code> file to exclude generated files from our remote repository.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> *.log
.DS_Store
node_modules
.cache
.idea
dist
coverage
.parcel-cache
example/.parcel-cache/*
example/build
storybook-static
</code></pre></div></div>
<p>Secondly, we added both an <code class="language-plaintext highlighter-rouge">.eslintrc.json</code> and an <code class="language-plaintext highlighter-rouge">.eslintignore</code> file to enforce a number of rules to guide the standard of written code and to specify which directories should not be linted.</p>
<p>Thirdly, we added <a href="https://jestjs.io/">Jest</a>, the testing framework maintained by <a href="https://en.wikipedia.org/wiki/Meta_Platforms">Meta (formerly Facebook)</a>, to the project to aid our test-driven development.</p>
<p>To set a high bar for the reliability of the application code, we specified that coverage of all branches, functions, lines, and statements should be at 100% within the library. As it stands, we currently have 100% test coverage on the application code.</p>
<p>Below is the current configuration specified in the project’s <code class="language-plaintext highlighter-rouge">jest.config.ts</code> file:</p>
<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kr">module</span><span class="p">.</span><span class="nx">exports</span> <span class="o">=</span> <span class="p">{</span>
<span class="na">preset</span><span class="p">:</span> <span class="dl">'</span><span class="s1">ts-jest</span><span class="dl">'</span><span class="p">,</span>
<span class="na">testEnvironment</span><span class="p">:</span> <span class="dl">'</span><span class="s1">jsdom</span><span class="dl">'</span><span class="p">,</span>
<span class="na">collectCoverage</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
<span class="na">coverageReporters</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">json</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">lcov</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">text</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">html</span><span class="dl">'</span><span class="p">],</span>
<span class="na">coverageThreshold</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">global</span><span class="dl">'</span><span class="p">:</span> <span class="p">{</span>
<span class="dl">'</span><span class="s1">branches</span><span class="dl">'</span><span class="p">:</span> <span class="mi">100</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">functions</span><span class="dl">'</span><span class="p">:</span> <span class="mi">100</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">lines</span><span class="dl">'</span><span class="p">:</span> <span class="mi">100</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">statements</span><span class="dl">'</span><span class="p">:</span> <span class="mi">100</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Our components have been unit tested using the <a href="https://testing-library.com/docs/react-testing-library/intro/">React Testing Library</a> created by Kent C. Dodds.</p>
<p>To standardise the format of the commit messages we turned to <a href="https://commitlint.js.org/">commitlint</a>, a linter for commit messages.</p>
<p>Commitlint ensures that messages MUST be prefixed with one of the following depending on the content of the commit.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="o">[</span>build, chore, ci, docs, feat, fix, perf, refactor, revert, style, <span class="nb">test</span><span class="o">]</span> <span class="o">[</span>type-enum]
git commit <span class="nt">-m</span> <span class="s2">"build: {{ name of build config change }}"</span>
git commit <span class="nt">-m</span> <span class="s2">"feat: {{ name of feature }}"</span>
git commit <span class="nt">-m</span> <span class="s2">"fix: {{ name of bug }}"</span>
git commit <span class="nt">-m</span> <span class="s2">"test: {{ name of test }}"</span>
</code></pre></div></div>
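<p>Commitlint reads these rules from a configuration file at the root of the repository. A minimal sketch of such a configuration, assuming the <code class="language-plaintext highlighter-rouge">@commitlint/config-conventional</code> preset (the library’s actual configuration may differ), could look like this:</p>
<div class="language-ts highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Minimal commitlint configuration sketch, assuming the
// @commitlint/config-conventional preset; the project's real config may differ.
module.exports = {
  extends: ['@commitlint/config-conventional'],
  rules: {
    // restrict commit prefixes to the types listed above
    'type-enum': [
      2,
      'always',
      ['build', 'chore', 'ci', 'docs', 'feat', 'fix', 'perf', 'refactor', 'revert', 'style', 'test'],
    ],
  },
};
</code></pre></div></div>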
<p>Finally, as well as the above, we used <a href="https://www.npmjs.com/package/husky">husky</a> to add two Git hooks to the project, which run when contributors commit new code and push code upstream to the remote repository:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pre-commit</code>: used to lint the content within the commit, if any lint errors are found then the commit will fail</li>
<li><code class="language-plaintext highlighter-rouge">pre-push</code>: used to start a full jest test run and check for 100% test coverage, if any tests are broken or test coverage is below the set 100% configuration then the push will fail.</li>
</ul>
<h2 id="stage-3-cicd-set-up-for-the-library">Stage 3: CI/CD Set up for the library</h2>
<p>To ensure consistency in the application code on the remote branch, we decided to use <a href="https://circleci.com/">CircleCI</a> within the git workflow to automate continuous integration. This tool integrates directly with the host git repository and allowed us to maintain the integrity of the combined content within the main and release branches.</p>
<p>The automated builds were configured with the following <code class="language-plaintext highlighter-rouge">config.yml</code>:</p>
<div class="language-yml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="na">version</span><span class="pi">:</span> <span class="m">2</span>
<span class="na">jobs</span><span class="pi">:</span>
<span class="na">build</span><span class="pi">:</span>
<span class="na">docker</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">image</span><span class="pi">:</span> <span class="s">circleci/node:12.22.0</span>
<span class="na">working_directory</span><span class="pi">:</span> <span class="s">~/repo</span>
<span class="na">steps</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">checkout</span>
<span class="c1"># Download and cache dependencies</span>
<span class="pi">-</span> <span class="na">restore_cache</span><span class="pi">:</span>
<span class="na">keys</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">v1-dependencies-{{ checksum "package.json" }}</span>
<span class="c1"># fallback to using the latest cache if no exact match is found</span>
<span class="pi">-</span> <span class="s">v1-dependencies-</span>
<span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">yarn install</span>
<span class="pi">-</span> <span class="na">save_cache</span><span class="pi">:</span>
<span class="na">paths</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">node_modules</span>
<span class="na">key</span><span class="pi">:</span> <span class="s">v1-dependencies-{{ checksum "package.json" }}</span>
<span class="c1"># run lint</span>
<span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">yarn lint</span>
<span class="c1"># run tests!</span>
<span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">yarn test --runInBand --logHeapUsage && ./node_modules/.bin/codecov</span>
<span class="c1"># deploy storybook</span>
<span class="pi">-</span> <span class="na">run</span><span class="pi">:</span> <span class="s">yarn chromatic --project-token=c6317a751fef --auto-accept-changes</span>
</code></pre></div></div>
<h2 id="stage-4-storybook-documentation">Stage 4: Storybook Documentation</h2>
<p>As with most component libraries, we created a Capgemini themed storybook which showcases all the implemented components. It includes a detailed description of each component, including a list of required and optional properties, example styled components and a live preview which gives consumers the ability to play with all the components by editing default props.</p>
<p>Documentation for all components is held in <code class="language-plaintext highlighter-rouge">.mdx</code> files within the <code class="language-plaintext highlighter-rouge">\stories</code> directory.</p>
<p>We have also used a series of addons to build the storybook.</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">module</span><span class="p">.</span><span class="nx">exports</span> <span class="o">=</span> <span class="p">{</span>
<span class="na">stories</span><span class="p">:</span> <span class="p">[</span><span class="dl">'</span><span class="s1">../stories/**/*.stories.@(mdx)</span><span class="dl">'</span><span class="p">],</span>
<span class="na">addons</span><span class="p">:</span> <span class="p">[</span>
<span class="dl">'</span><span class="s1">@storybook/addon-links</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">@storybook/addon-essentials</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">@storybook/addon-docs</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">@storybook/addon-controls</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">storybook-css-modules-preset</span><span class="dl">'</span><span class="p">,</span>
<span class="dl">'</span><span class="s1">@storybook/addon-a11y</span><span class="dl">'</span>
<span class="p">],</span>
<span class="p">};</span>
</code></pre></div></div>
<h2 id="stage-5-contributing-to-the-react-dcx-library">Stage 5: Contributing to the React DCX library</h2>
<h3 id="introduction-1">Introduction</h3>
<p>The following is an abstract directory tree of the DCX React Library repository with a single component named <code class="language-plaintext highlighter-rouge">ComponentName</code></p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dcx-react-library
├── example/
│ ├── src/
│ │ ├── components/ <span class="c"># add example usage of component</span>
│ │ │ └── ComponentNameDemo.tsx
│ │ └── index.tsx
├── src/
│ ├── componentName/ <span class="c"># the actual component itself</span>
│ │ ├── __test__/
│ │ │ └── ComponentName.test.tsx
│ │ ├── ComponentName.tsx
│ │ └── index.ts
│ └── index.ts
├── static/
├── stories/ <span class="c"># the story demo for component</span>
│ ├── ComponentName/
│ │ ├── Documentation.stories.mdx
│ │ ├── Live.stories.mdx
│ │ ├── Styled.stories.mdx
│ │ └── Unstyled.stories.mdx
│ ├── liveEdit
│ │ └── ComponentNameLive.tsx <span class="c"># the editable render of the component</span>
│ ├── Introduction.stories.mdx
│ └── style.css <span class="c"># styles used within the stories of the components</span>
├── .eslintignore
├── .eslintrc.json
├── .gitignore
├── CHANGELOG.md
├── CONTRIBUTING.md
├── jest.config.ts
├── LICENSE
├── netlify.toml
├── package.json
├── README.md
├── setup.sh
├── tsconfig.json
└── yarn.lock
</code></pre></div></div>
<p>Now the fun part, contributing to the library. This can be done in a few ways:</p>
<ol>
<li>Adding a new component</li>
<li>Updating documentation</li>
<li>Enhancing an existing component</li>
<li>Resolving bugs</li>
<li>Improving accessibility</li>
</ol>
<h3 id="adding-a-new-component">Adding a new component</h3>
<p>The first step would be to decide what kind of component you would like to add, what functionalities the component should support based on consumer needs and how the component can broaden the breadth of the library.</p>
<p>After deciding on the above the first thing to do will be to add a directory for your component within the <code class="language-plaintext highlighter-rouge">src</code> directory.</p>
<p>Add a <code class="language-plaintext highlighter-rouge">.tsx</code> file for your component with a file name matching the name of your component e.g. <code class="language-plaintext highlighter-rouge">ComponentName.tsx</code></p>
<p>Within your <code class="language-plaintext highlighter-rouge">ComponentName.tsx</code> add an initial export, for example</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kd">type</span> <span class="nx">ComponentNameProps</span> <span class="o">=</span> <span class="p">{</span>
<span class="cm">/**
* a property for Component Name
*/</span>
<span class="na">componentProperty</span><span class="p">:</span> <span class="kr">any</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">ComponentName</span> <span class="o">=</span> <span class="p">({</span> <span class="nx">componentProperty</span> <span class="p">}:</span> <span class="nx">ComponentNameProps</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="c1">// implementation will go here</span>
<span class="k">return</span> <span class="p">(</span>
<span class="c1">// render component code will go here</span>
<span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>
<p>To export the component for use, the newly added component will need to be added to the component’s <code class="language-plaintext highlighter-rouge">index.ts</code> file, for example</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">./ComponentName</span><span class="dl">'</span><span class="p">;</span>
</code></pre></div></div>
<p>then within the <code class="language-plaintext highlighter-rouge">src/index.ts</code> file the full list of exports within the <code class="language-plaintext highlighter-rouge">componentName</code> directory can be exported by adding the following.</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">export</span> <span class="o">*</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">./componentName</span><span class="dl">'</span><span class="p">;</span>
</code></pre></div></div>
<p>Once the above is complete, you can start the implementation by adding unit tests for the component within a <code class="language-plaintext highlighter-rouge">ComponentName.test.tsx</code> test file in the <code class="language-plaintext highlighter-rouge">src/componentName/__test__/</code> directory.</p>
<p>Using the <a href="https://testing-library.com/docs/react-testing-library/intro/">React Testing Library</a> you will now be in a position to write a test, for example</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="nx">React</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">react</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">render</span><span class="p">,</span> <span class="nx">screen</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@testing-library/react</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="dl">'</span><span class="s1">@testing-library/jest-dom</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">../ComponentName</span><span class="dl">'</span><span class="p">;</span>
<span class="nx">describe</span><span class="p">(</span><span class="dl">'</span><span class="s1">ComponentName</span><span class="dl">'</span><span class="p">,</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="nx">it</span><span class="p">(</span><span class="dl">'</span><span class="s1">should render</span><span class="dl">'</span><span class="p">,</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="nx">render</span><span class="p">(<</span><span class="nc">ComponentName</span> <span class="na">componentProperty</span><span class="p">=</span><span class="s">"some-property"</span> <span class="p">/>);</span>
<span class="nx">expect</span><span class="p">(</span><span class="nx">screen</span><span class="p">..</span><span class="nx">getByText</span><span class="p">(</span><span class="dl">'</span><span class="s1">some-property</span><span class="dl">'</span><span class="p">)).</span><span class="nx">toBeInTheDocument</span><span class="p">();</span>
<span class="p">});</span>
<span class="p">});</span>
</code></pre></div></div>
<p>Once you’ve added a feature to your <code class="language-plaintext highlighter-rouge">ComponentName</code>, you can showcase it in a <code class="language-plaintext highlighter-rouge">ComponentNameDemo.tsx</code> file within the <code class="language-plaintext highlighter-rouge">example/src/components/</code> directory, as follows:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="nx">React</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">react</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@capgeminiuk/dcx-react-library</span><span class="dl">'</span><span class="p">;</span>
<span class="k">export</span> <span class="kd">const</span> <span class="nx">ComponentNameDemo</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">return</span> <span class="p">(</span>
<span class="p"><></span>
<span class="p"><</span><span class="nt">h1</span><span class="p">></span>Demo of ComponentName<span class="p"></</span><span class="nt">h1</span><span class="p">></span>
<span class="p"><</span><span class="nc">ComponentName</span>
<span class="na">componentProperty</span><span class="p">=</span><span class="s">"some-property"</span>
<span class="p">/></span>
<span class="p"></></span>
<span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">ComponentNameDemo</code> will then need to be added to the <code class="language-plaintext highlighter-rouge">example/src/index.tsx</code> file for it to be present within the example app front end, for example:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentNameDemo</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">./components</span><span class="dl">'</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">App</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">(</span>
<span class="p"><</span><span class="nt">div</span><span class="p">></span>
<span class="p"><</span><span class="nc">BrowserRouter</span><span class="p">></span>
<span class="p"><</span><span class="nc">Switch</span><span class="p">></span>
<span class="p"><</span><span class="nc">Route</span> <span class="na">path</span><span class="p">=</span><span class="s">"/componentName"</span> <span class="na">exact</span> <span class="na">component</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentNameDemo</span><span class="si">}</span> <span class="p">/></span>
<span class="p"></</span><span class="nc">Switch</span><span class="p">></span>
<span class="p"></</span><span class="nc">BrowserRouter</span><span class="p">></span>
<span class="p"></</span><span class="nt">div</span><span class="p">></span>
<span class="p">);</span>
<span class="nx">ReactDOM</span><span class="p">.</span><span class="nx">render</span><span class="p">(<</span><span class="nc">App</span> <span class="p">/>,</span> <span class="nb">document</span><span class="p">.</span><span class="nx">getElementById</span><span class="p">(</span><span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">));</span>
</code></pre></div></div>
<p>In a terminal run the following to link the changes within the DCX React Library to the example folder</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>yarn
</code></pre></div></div>
<p>then</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd </span>example
</code></pre></div></div>
<p>finally</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>../setup.sh
</code></pre></div></div>
<p>This will also open http://localhost:3000 in your default browser.</p>
<p><img src="/images/2022-05-27-contributing-to-dcx-library/contributing-to-dcx-library-example-demo.jpeg" alt="Example Demo Page" /></p>
<h3 id="updating-storybook-documentation">Updating storybook documentation</h3>
<p>Now that we’ve added our <code class="language-plaintext highlighter-rouge">ComponentName</code> we will now need to create a set of stories for our component.</p>
<p>There are four <code class="language-plaintext highlighter-rouge">.mdx</code> story files that we have for each component, which are:</p>
<h4 id="documentationstoriesmdx">Documentation.stories.mdx</h4>
<p>A file to add the general description of the component along with a full list of the props the component has.</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">import</span> <span class="p">{</span> <span class="nx">Meta</span><span class="p">,</span> <span class="nx">Story</span><span class="p">,</span> <span class="nx">Canvas</span><span class="p">,</span> <span class="nx">Props</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@storybook/addon-docs/blocks</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">../../src/componentName/ComponentName</span><span class="dl">'</span><span class="p">;</span>
<span class="p"><</span><span class="nc">Meta</span>
<span class="na">title</span><span class="p">=</span><span class="s">"DCXLibrary/ComponentName/documentation"</span>
<span class="na">component</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentName</span><span class="si">}</span>
<span class="na">parameters</span><span class="p">=</span><span class="si">{</span><span class="p">{</span>
<span class="na">viewMode</span><span class="p">:</span> <span class="dl">'</span><span class="s1">docs</span><span class="dl">'</span><span class="p">,</span>
<span class="na">previewTabs</span><span class="p">:</span> <span class="p">{</span>
<span class="na">canvas</span><span class="p">:</span> <span class="p">{</span> <span class="na">hidden</span><span class="p">:</span> <span class="kc">true</span> <span class="p">},</span>
<span class="p">},</span>
<span class="p">}</span><span class="si">}</span>
<span class="p">/></span>
<span class="c1">// ComponentName can be added here</span>
<span class="c1">// Here is where a general description of the component can be added</span>
<span class="c1">// Usage example added here</span>
<span class="p"><</span><span class="nc">ComponentName</span>
<span class="na">componentProperty</span><span class="p">=</span><span class="s">"some-property"</span>
<span class="p">/></span>
<span class="p"><</span><span class="nc">Props</span> <span class="na">of</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentName</span><span class="si">}</span> <span class="p">/></span>
</code></pre></div></div>
<h4 id="livestoriesmdx">Live.stories.mdx</h4>
<p>A file to add a live edit of the newly created component, which offers consumers of the library a place to edit the component to observe how it renders with a specific set of props, for example:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">import</span> <span class="p">{</span> <span class="nx">Meta</span><span class="p">,</span> <span class="nx">Story</span><span class="p">,</span> <span class="nx">Canvas</span><span class="p">,</span> <span class="nx">Props</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">@storybook/addon-docs/blocks</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">../../src/componentName/ComponentName</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="nx">ComponentNameLive</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">../liveEdit/ComponentNameLive</span><span class="dl">'</span><span class="p">;</span>
<span class="p"><</span><span class="nc">Meta</span>
<span class="na">title</span><span class="p">=</span><span class="s">"DCXLibrary/Form/ComponentName/live"</span>
<span class="na">component</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentName</span><span class="si">}</span>
<span class="na">parameters</span><span class="p">=</span><span class="si">{</span><span class="p">{</span>
<span class="na">viewMode</span><span class="p">:</span> <span class="dl">'</span><span class="s1">docs</span><span class="dl">'</span><span class="p">,</span>
<span class="na">previewTabs</span><span class="p">:</span> <span class="p">{</span>
<span class="na">canvas</span><span class="p">:</span> <span class="p">{</span> <span class="na">hidden</span><span class="p">:</span> <span class="kc">true</span> <span class="p">},</span>
<span class="p">},</span>
<span class="p">}</span><span class="si">}</span>
<span class="p">/></span>
<span class="c1">// ComponentName</span>
<span class="c1">// In the live editor you can play with all the available properties</span>
<span class="c1">// change the look and feel and interact with the component</span>
<span class="p"><</span><span class="nc">Canvas</span><span class="p">></span>
<span class="p"><</span><span class="nc">Story</span> <span class="na">name</span><span class="p">=</span><span class="s">"live"</span><span class="p">></span>
<span class="p"><</span><span class="nc">ComponentNameLive</span> <span class="p">/></span>
<span class="p"></</span><span class="nc">Story</span><span class="p">></span>
<span class="p"></</span><span class="nc">Canvas</span><span class="p">></span>
<span class="c1">// Properties</span>
<span class="c1">// below are described the list of all available properties.</span>
<span class="c1">// the one marked with (\*) are mandatory the other instead are optional.</span>
<span class="p"><</span><span class="nc">Props</span> <span class="na">of</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentName</span><span class="si">}</span> <span class="p">/></span>
</code></pre></div></div>
<p>Before creating the above, you will need to ensure that you have created the <code class="language-plaintext highlighter-rouge">liveEdit/ComponentNameLive.tsx</code> module, an example of this is:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">import</span> <span class="nx">React</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">react</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">LiveProvider</span><span class="p">,</span> <span class="nx">LiveEditor</span><span class="p">,</span> <span class="nx">LiveError</span><span class="p">,</span> <span class="nx">LivePreview</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">react-live</span><span class="dl">'</span><span class="p">;</span>
<span class="k">import</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">../../src/componentName/ComponentName</span><span class="dl">'</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">ComponentNameDemo</span> <span class="o">=</span> <span class="s2">`
function ComponentNameDemo() {
return (
    <ComponentName
componentProperty="some-property"
/>
)
}
`</span><span class="p">.</span><span class="nx">trim</span><span class="p">();</span>
<span class="kd">const</span> <span class="nx">ComponentNameLive</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">scope</span> <span class="o">=</span> <span class="p">{</span> <span class="nx">ComponentName</span> <span class="p">};</span>
<span class="k">return</span> <span class="p">(</span>
<span class="p"><</span><span class="nc">LiveProvider</span> <span class="na">code</span><span class="p">=</span><span class="si">{</span><span class="nx">ComponentNameDemo</span><span class="si">}</span> <span class="na">scope</span><span class="p">=</span><span class="si">{</span><span class="nx">scope</span><span class="si">}</span><span class="p">></span>
<span class="p"><</span><span class="nt">div</span> <span class="na">className</span><span class="p">=</span><span class="s">"container"</span><span class="p">></span>
<span class="p"><</span><span class="nc">LiveEditor</span> <span class="na">className</span><span class="p">=</span><span class="s">"liveEditor"</span> <span class="na">aria-label</span><span class="p">=</span><span class="s">"editor"</span> <span class="p">/></span>
<span class="p"><</span><span class="nc">LivePreview</span> <span class="na">className</span><span class="p">=</span><span class="s">"livePreview"</span> <span class="na">aria-label</span><span class="p">=</span><span class="s">"preview"</span> <span class="p">/></span>
<span class="p"></</span><span class="nt">div</span><span class="p">></span>
<span class="p"><</span><span class="nc">LiveError</span> <span class="na">className</span><span class="p">=</span><span class="s">"liveError"</span> <span class="na">aria-label</span><span class="p">=</span><span class="s">"error"</span> <span class="p">/></span>
<span class="p"></</span><span class="nc">LiveProvider</span><span class="p">></span>
<span class="p">);</span>
<span class="p">};</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">ComponentNameLive</span><span class="p">;</span>
</code></pre></div></div>
<h4 id="styledstoriesmdx">Styled.stories.mdx</h4>
<p>A file where styled stories of <code class="language-plaintext highlighter-rouge">ComponentName</code> can be added; the styles themselves can be added to the <code class="language-plaintext highlighter-rouge">stories/style.css</code> file.</p>
<p>For scoped styles, a <code class="language-plaintext highlighter-rouge">style.css</code> file can be created within the <code class="language-plaintext highlighter-rouge">ComponentName</code> stories directory, with the specific styles added there, i.e. <code class="language-plaintext highlighter-rouge">stories/ComponentName/styles.css</code>.</p>
<h4 id="unstyledstoriesmdx">Unstyled.stories.mdx</h4>
<p>A file where un-styled stories of <code class="language-plaintext highlighter-rouge">ComponentName</code> can be added; these stories contain basic usage of the component with no styles applied.</p>
<h3 id="enhancing-an-existing-component">Enhancing an existing component</h3>
<p>Within the project’s GitHub <a href="https://github.com/Capgemini/dcx-react-library/projects/2">project board</a> we have a number of enhancements, with details of the desired changes, that we want to implement for the upcoming 0.5 release. Below is a snippet of a <a href="https://github.com/Capgemini/dcx-react-library/issues/242">previous enhancement</a> made to the <a href="https://main--6069a6f47f4b9f002171f8e1.chromatic.com/?path=/docs/dcxlibrary-form-select-documentation--page">FormSelect</a> component in the 0.4 release.</p>
<p>Currently when you want to pass the <code class="language-plaintext highlighter-rouge">options</code> you need to specify the value and the label.</p>
<div class="language-jsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="nx">options</span><span class="o">=</span><span class="p">{[{</span>
<span class="na">label</span><span class="p">:</span> <span class="dl">'</span><span class="s1">Recently published</span><span class="dl">'</span><span class="p">,</span>
<span class="na">value</span><span class="p">:</span> <span class="dl">'</span><span class="s1">published</span><span class="dl">'</span>
<span class="p">}]}</span>
</code></pre></div></div>
<p>The enhancement allows a plain array of strings to be passed instead, which is perfect in cases where the label does not need to differ from the value.</p>
<div class="language-jsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="nx">options</span><span class="o">=</span><span class="p">{[</span><span class="dl">'</span><span class="s1">a</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">b</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">c</span><span class="dl">'</span><span class="p">]}</span>
</code></pre></div></div>
<p>When making such changes it is important to ensure we do not remove or break any pre-existing functionality while adding the extra properties needed to support the desired enhancement.</p>
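<p>One way such an enhancement can be supported without breaking existing consumers is to widen the prop type and normalise internally. The following TypeScript sketch is purely illustrative and is not the library’s actual implementation:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Illustrative only - not the dcx-react-library source.
// Widening the options type keeps the original { label, value } form working
// while also accepting plain strings where label and value are the same.
type OptionItem = { label: string; value: string };
type OptionInput = OptionItem | string;

// Normalise both shapes into the original { label, value } form
function normaliseOptions(options: OptionInput[]): OptionItem[] {
  return options.map((option) =>
    typeof option === 'string' ? { label: option, value: option } : option
  );
}

// normaliseOptions(['a', 'b']) => [{ label: 'a', value: 'a' }, { label: 'b', value: 'b' }]
</code></pre></div></div>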
<h3 id="resolving-bugs">Resolving bugs</h3>
<p>As the number of consumers of the library increases, we may find bugs within the implemented components. We encourage consumers to raise these on the project’s GitHub <a href="https://github.com/Capgemini/dcx-react-library/issues?q=is%3Aissue+label%3Abug">bugs</a> page.</p>
<p>Participating in this way offers the opportunity to investigate issues to find a solution, which often is the best way to learn a new codebase.</p>
<h3 id="improving-accessibility">Improving accessibility</h3>
<p>All components are tested for accessibility and as we grow the list of components, we aim to ensure that all components meet WCAG 2.0 accessibility standards.</p>
<p>If any accessibility bugs are found, we encourage consumers to raise issues on the project’s GitHub list of <a href="https://github.com/Capgemini/dcx-react-library/issues?q=is%3Aissue+label%3Aa11y">accessibility issues</a>.</p>
<h2 id="thinking-of-contributing">Thinking of contributing?</h2>
<ul>
<li>
<p>If you would like to know more about the library, feel free to contact <a href="daniele.zurico@capgemini.com">Daniele Zurico</a> or <a href="isaac.babalola@capgemini.com">Isaac Babalola</a>.</p>
</li>
<li>
<p>If you are interested in using the library, it is now publicly available on <a href="https://www.npmjs.com/package/@capgeminiuk/dcx-react-library">npm</a>.</p>
</li>
<li>
<p>If you would like to contribute, you can do so by forking the <a href="https://github.com/Capgemini/dcx-react-library">public repository</a>.</p>
</li>
<li>
<p>If you would like to familiarise yourself with all of the built components, please take a look at the <a href="https://main--6069a6f47f4b9f002171f8e1.chromatic.com">storybook documentation</a>.</p>
</li>
</ul>
<p><a href="https://capgemini.github.io/development/contributing-to-dcx-react-library/">Contributing to the DCX React Library</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 27, 2022.</p>https://capgemini.github.io/development/the-efficient-cloud-era2022-05-20T00:00:00+01:002022-05-20T00:00:00+01:00Sarah Saundershttps://capgemini.github.io/authors#author-sarah-saunders
<p>The main theme at <a href="https://devoxx.co.uk">Devoxx UK</a> this year was all about getting Java to be fast and lean in the cloud.</p>
<p>From improving startup time to allow serverless Java apps, to enabling scale-to-zero, to ensuring your application is running efficiently, many of the talks at the Devoxx UK 2022 conference were really focussing on a net outcome of reducing cost, which in turn will reduce power requirements for IT estates and aid sustainability.</p>
<h2 id="back-to-the-buzz">Back to the Buzz</h2>
<p>After a nervous 2-day hybrid conference last year, Devoxx was back to its buzzing self this year with speakers delighted to be back in front of audiences and the sponsors’ booths once again bursting with free gifts and chat. Capgemini were gold sponsors this year, with engineers <a href="https://capgemini.github.io/authors/#author-kevin-rudland">Kevin Rudland</a> and <a href="https://capgemini.github.io/authors/#author-chris-burns">Chris Burns</a> giving their talk “How to get Hacker Kids to Max Out your AWS account in 10 hours, and other reasons to focus on your Secure Software Supply Chain” (more on that later).</p>
<p><img src="/images/2022-05-17-the-efficient-cloud-era/stand.jpg" alt="Capgemini's stall at Devoxx 2022" /></p>
<p>Our T-shirt and mug freebies flew off the shelf, and our vegan, palm-oil-free pick ‘n’ mix was popular in the post-lunch lull. Fantastic Capgemini AIE artist <a href="https://uk.linkedin.com/in/jack-ambrose">Jack Ambrose</a> was once again on hand to help people visualise “Getting the future they want”.
<img src="/images/2022-05-17-the-efficient-cloud-era/art.jpg" alt="Jack Ambrose art" /></p>
<h2 id="efficiency-at-the-fore">Efficiency at the Fore</h2>
<p>There isn’t time to go to all the talks at Devoxx, so this article is skewed by my choices; however, there were many talks around similar themes: improving the efficiency and speed of Java applications, and deploying to the cloud, with Kubernetes and Knative being prevalent. I’ve listed here some strong themes and great facts from the talks that I attended over the three days.</p>
<h3 id="kubernetes-by-default">Kubernetes by Default</h3>
<p>Everyone but everyone was talking about deploying with <a href="https://kubernetes.io/">Kubernetes</a>. This may have been directly or using <a href="https://knative.dev/">Knative</a>. I attended “<a href="https://www.youtube.com/watch?v=1_sJVbabBgk">Fantastic Java apps and how to kubefy them with Dekorate</a>”, a live coding demo showing how the Dekorate annotations could be used to generate your Kubernetes manifest files, allowing Java devs to reduce the number of languages and syntaxes they need to get their heads around to create a Kubernetes runtime. Our Capgemini talk also suggests Kubernetes-as-a-service as the best abstraction layer between your own deployment artifacts (i.e. containers) and your cloud provider platform. There were talks specifically focussing on improving the sustainability of Kubernetes clusters using schedulers - for example “<a href="https://www.youtube.com/watch?v=MzaMBfYbvss">Sustainability in software engineering - today and tomorrow</a>”. In this talk, speaker Martin Lippert refers to a <a href="https://www.anthesisgroup.com/wp-content/uploads/2019/11/Comatose-Servers-Redux-2017.pdf">2019 report</a> suggesting a quarter of data centre servers are “zombie servers” - running and using electricity but hosting no active applications. Unfortunately, it seems the same applies to virtual machines, suggesting that if we really want to reduce our power footprint for our estates, we NEED an auto-scaling platform such as Kubernetes, and we probably need a hyperscaler capable of managing the underlying machines when not in use.</p>
<h3 id="speed-up-your-start-time">Speed up your Start Time</h3>
<p>There has been a real buzz around <a href="https://www.graalvm.org">GraalVM</a> in recent years at Devoxx and the wider Java industry, looking at how it can improve the startup time of your Java application. This year the speakers drilled even deeper into how to speed up an app’s start time, without losing its efficiency. My favourite talk in this area, the catchily-named “<a href="https://www.youtube.com/watch?v=0evEs_3yaEI">Java on CRaC</a>” (CRaC = Co-Ordinated Restore at Checkpoint, I rather suspect the initials were chosen first…) looked into how we could start applications with the speed of a native image without losing the efficiency savings which come from running the Just-in-Time (JIT) compiler. In summary, Java bytecode runs on a JVM - Java virtual machine - which, at startup of the application, compiles frequently-executed code to native machine code. This takes a while, and to speed things up it’s possible with GraalVM to use Ahead-of-Time (AOT) compilation and run this slow process of creating a native image BEFORE the application starts. A great use-case for this is serverless functions - to be as efficient as possible with our compute time, we’d like a serverless function to scale down to zero instances in production until it’s called, then spin up in a timely manner and execute our call when we want it.
There is, of course, a downside to AOT. Creating your native image before startup means the application can’t be as effectively profiled to identify “hot-spots” so overall performance is typically lower. According to the talk, applications started from an AOT image are about 0.6 times the speed of a JIT-compiled application; although you can raise this to about 0.8 times the speed with some extra performance evaluation during compilation.</p>
<p>The answer suggested by this talk was to start an application with the JIT compiler, but then “freeze” it once it was running and save that frozen state. Future starts of the application could use the frozen state kind of like an AOT image, meaning you get all the benefits of JIT compilation and also instant start-up. The statistics shown in the talk were impressive to say the least. Java apps on CRaC start up 2 orders of magnitude faster.
<img src="/images/2022-05-17-the-efficient-cloud-era/slide.jpg" alt="CRaC 2 orders of magnitude faster to start" /></p>
<h3 id="shrinking-your-apps">Shrinking your Apps</h3>
<p><a href="https://quarkus.io/">Quarkus</a> was even more omnipresent at Devoxx - with the “Quarkus World Tour” on the RedHat stand and numerous talks - including “Integrating systems in the age of Quarkus, serverless and Kafka”, “Migrating a Spring Boot app to Quarkus, challenge accepted”. The message is clear, the modern focus is on improving the speed and footprint of your application; being in the cloud is a given, the next stage is being the best cloud app that you can.
There is a warning though about blindly focussing on your applications. I attended a <a href="https://www.youtube.com/watch?v=q4Fd3_u_kXw">very interesting talk</a> on how to tune your Java virtual machine (JVM) to better support a container runtime - have you ever thought about how Kubernetes allocates CPUs to the container where your JVM is running? Kubernetes allocates a time slice of a physical CPU to the container, and the JVM translates these “minicores” into processes. Assumptions about the best garbage collection model (serial or parallel?) and JIT compiler tier (C1 or C2?) are made by the JVM based on its perception of how many CPUs it has available - and when this is abstracted by a container and Kubernetes, the JVM often gets this wrong. The consequences of this are, for example, serial garbage collection freezing your entire application instead of utilising the multiple processors you may be paying for. For more information on JVM tuning for Kubernetes, the speaker recommended checking out <a href="https://www.infoq.com/interviews/beckwith-garbage-collection/">Monica Beckwith’s tuning video on InfoQ</a>. You can also use the <a href="https://docs.oracle.com/javacomponents/jmc-5-4/jfr-runtime-guide/about.htm#JFRUH170">Java Flight Recorder</a> to better understand what your JVM is doing.</p>
<h3 id="minimising-your-estate">Minimising your Estate</h3>
<p>Another angle to cloud efficiency that came through strongly at Devoxx was the serverless model; the ability to scale applications to zero when not in use. And this doesn’t just include business applications - support applications were also considered. In his talk on <a href="https://www.youtube.com/watch?v=SYO-LmA647E">Java observability</a>, RedHat’s <a href="https://developers.redhat.com/author/ben-evans">Ben Evans</a> talked about how some companies could have, for example, 200 servers just for analysing log data! Think about it - you need storage active all the time to capture your log data, sure, (and of course not on the machines that generate the log messages - hard to investigate an outage if your logs were on the machine that went down!) and you need listeners ready to send monitoring/alerting messages, but for complex log analysis you should be able to just spin up the data-reading machines when you need them.
Many of the talk demos used Knative to demonstrate apps scaling up/down from zero instances based on load. Our Capgemini talk discusses our CREATE accelerator which spins up the whole development environment - pipelines and all - on check-in of a change. Think about it - do you have Jenkins/Concourse build servers sitting there and eating expensive CPU time when they’re not being used? Not necessary!</p>
<h3 id="more-than-just-a-nod-to-security">More than just a nod to security</h3>
<p>As expected, especially with <a href="https://snyk.io">Snyk</a> as platinum sponsors, there were several talks focussing on application security. I went to one detailing some interesting but unlikely ways that, given an unfortunate series of events, deserialization of Java objects or even of JSON could lead to injection attacks allowing hackers to launch applications on your machine.
<a href="https://www.youtube.com/watch?v=qJfDh00c6fs">Our own talk</a> focussed on ensuring a secure software supply chain. It’s worthy of a blog in itself, and indeed the talk could have been several talks.</p>
<p>To summarise:</p>
<ul>
<li>Ensure minimum permissions for every communication</li>
<li>Ensure the provenance of your artifacts</li>
<li>Aim for zero-trust, but be aware it’s idealistic and you may not be able to achieve it</li>
<li>Be aware that the biggest security threat to your system is YOU!</li>
</ul>
<p>Capgemini Software Engineering have done it all for you with our cloud accelerator, a series of open-source products tied into an architecture which will spin you up an entire cloud-based secure software supply chain for your development needs - and allow you to tear it all down whenever it’s not in use. <a href="mailto:sarah.saunders@capgemini.com">Get in touch with us</a> for more information!</p>
<h3 id="and-just-better-apps">And, just… Better Apps!</h3>
<p>Another talk I really enjoyed was <a href="https://www.youtube.com/watch?v=eFheAErqJzA">Functional Programming in Kotlin - exploring Arrow</a>. I’d seen a little of Kotlin before but this talk came in from the angle of problem-solving: Have you been bitten before by support issues involving NullPointerExceptions or ArrayIndexOutOfBoundExceptions? Of course you have! Speaker Ties van de Ven had, too. It was in search of a solution to this pain that he discovered Kotlin’s Arrow library and how it can find these exceptions - at compile time! Yes, really. Using monads (Quote: “If you know what a monad is, you can’t describe it”…) to define a return type that is EITHER an exception OR the value you were looking for as a starting point, you are then forced to deal with the two circumstances. Or you can go a step further and use <a href="https://arrow-kt.io/docs/analysis/">Arrow Analysis</a> library to run pre/post condition checks at compile time.</p>
<p>So, if you build a precondition check that a>0 into your divide(a) function, this code will compile:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="k">if</span><span class="o">(</span><span class="n">a</span><span class="o">></span><span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
<span class="n">divide</span><span class="o">(</span><span class="n">a</span><span class="o">);</span>
<span class="o">}</span></code></pre></figure>
<p>But, this code won’t:</p>
<figure class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">divide</span><span class="o">(</span><span class="n">a</span><span class="o">);</span></code></pre></figure>
<p>Wow! And there was me thinking Kotlin was all syntactic sugar and writing less code. I’m a convert.</p>
<p><a href="https://capgemini.github.io/development/the-efficient-cloud-era/">The Efficient Cloud Era</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 20, 2022.</p>https://capgemini.github.io/development/next.js-fundamentals-through-examples2022-02-11T00:00:00+00:002022-02-11T00:00:00+00:00Shemsedin Callakihttps://capgemini.github.io/authors#author-shemsedin-callaki
<h2 id="introduction">Introduction</h2>
<p>As static site generation becomes more and more popular, so does the need for the right tools and frameworks.</p>
<p>More often than not we need a lightweight tool such as React to consume a decoupled service and serve static pages. Generating static pages brings a lot of benefits, such as speed, caching via a CDN, SEO and more.</p>
<p>The purpose of this blog post, though, is to explain the fundamentals of <a href="https://nextjs.org">Next.js</a> - a great React framework that is growing in popularity. We will pick some of its features and explain them with examples, which will hopefully give readers a good grasp of the framework.</p>
<h2 id="assumption">Assumption</h2>
<p>As we will explain some of the Next.js features starting from the beginning, the only assumption is that you are familiar with React or JavaScript.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>Next.js requires Node.js to be installed. If you have already installed Node.js, run <code class="language-plaintext highlighter-rouge">node -v</code> in your terminal to check which version you have and compare it with the latest <a href="https://nodejs.org/">Node.js</a> release - the minimum supported version is Node.js 12.22.0.</p>
<h2 id="what-is-nextjs">What is Next.js</h2>
<p>Next.js is a React-based framework built on top of Node.js. With Next.js you can do a wide range of things, from creating APIs, to consuming external/internal APIs, to server-side rendering, static generation and a lot more.</p>
<p>As we know, React is a hugely popular library (see <a href="https://insights.stackoverflow.com/trends?tags=reactjs%2Cvue.js%2Cangular">some statistics here</a>), but it is only the view in the <a href="https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller">MVC (Model View Controller)</a> pattern. Next.js, on the other hand, is a framework that is built on top of React and Node.js. <a href="https://reactjs.org/docs/create-a-new-react-app.html#nextjs">The React documentation</a> mentions Next.js as one of its recommended toolchains.</p>
<h2 id="basic-features">Basic Features</h2>
<p>There are lots of features that Next.js supports out of the box, but here I will only focus on some of them, such as:</p>
<ul>
<li>TypeScript</li>
<li>Code Splitting</li>
<li>Routing</li>
<li>Static Generation</li>
<li>Data fetching</li>
</ul>
<p>Without further ado let’s get started on creating the application.</p>
<h2 id="creating-a-nextjs-application">Creating a Next.js Application</h2>
<p>Next.js supports <a href="https://www.typescriptlang.org/">TypeScript</a> out of the box, which means
you don’t need to do any additional configuration.</p>
<p>To create an application we can use the following command:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx create-next-app
<span class="c"># or</span>
yarn create next-app
</code></pre></div></div>
<p>For those of you not familiar with them, <code class="language-plaintext highlighter-rouge">npx</code> is a package runner and CLI tool which makes it easy to install and manage dependencies hosted in the npm registry, and <code class="language-plaintext highlighter-rouge">yarn</code> is a package manager.</p>
<h2 id="typescript">TypeScript</h2>
<p>As we mentioned above TypeScript is fully supported in Next.js and to create an application that
uses TypeScript you would type the following command:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx create-next-app <span class="nt">--ts</span>
<span class="c">#or</span>
yarn create next-app <span class="nt">--typescript</span>
</code></pre></div></div>
<p>As you can see we use <code class="language-plaintext highlighter-rouge">--ts</code> and <code class="language-plaintext highlighter-rouge">--typescript</code> flags to tell the CLI tool <code class="language-plaintext highlighter-rouge">create-next-app</code> to create the
application using TypeScript.</p>
<p>Now let’s create our application by navigating to your preferred directory and typing this command in your terminal:</p>
<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>npx create-next-app nextjs-with-typescript <span class="nt">--ts</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">nextjs-with-typescript</code> is the name of our application.</p>
<p>The above script will install all the necessary dependencies and, when it finishes, will print on the screen some commands that you can use to run the application. That’s all there is to it; you will now be able to run the application without the extra configuration that you would normally need in order to compile TypeScript.</p>
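<p>As a quick, minimal sketch (the file name is just an example), a typed page in the freshly created app could look like this, with no extra TypeScript configuration required:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// file pages/hello.tsx - a minimal typed page; works without any extra TS setup
import type { NextPage } from 'next'

const Hello: NextPage = () => {
  return <h1>Hello from a TypeScript page</h1>
}

export default Hello
</code></pre></div></div>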
<h2 id="code-splitting">Code Splitting</h2>
<p>Code splitting is an optimisation technique that splits the code into chunks or small bundles which can then be loaded on demand or in parallel, enabling the application to load a lot faster.</p>
<p>The importance of code splitting is best seen on a growing application: as the application grows, so does the size of the JavaScript bundle.</p>
<h3 id="nextjs-code-splitting">Next.js Code Splitting</h3>
<p>Next.js has built-in support for code splitting, which means you don’t have to configure any external tooling such as <a href="https://babeljs.io">Babel</a> yourself.</p>
<p>When loading a page, Next.js only loads the JavaScript necessary for that particular page. Next.js does this by analysing the resources each page imports. If, for example, one of your pages makes use of the <code class="language-plaintext highlighter-rouge">axios</code> library, then that specific page will include <code class="language-plaintext highlighter-rouge">axios</code> in its bundle. In this way we make sure that we only send the JavaScript needed to the client.</p>
<p>Next.js also supports <strong>dynamic import()</strong>; this feature makes it possible to import JavaScript modules dynamically and load each import as a separate chunk. To get an understanding of how that is done you can have a look at your application’s build directory, which is <code class="language-plaintext highlighter-rouge">.next</code>.</p>
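<p>As a minimal sketch of dynamic imports (the <code class="language-plaintext highlighter-rouge">HeavyChart</code> component and its path are hypothetical), <code class="language-plaintext highlighter-rouge">next/dynamic</code> can be used so that a heavy component is emitted as its own chunk and only fetched when the page that uses it is loaded:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// file pages/heavy-demo.tsx - illustrative only; assumes a ../components/HeavyChart module exists
import dynamic from 'next/dynamic'

// HeavyChart is split into its own chunk rather than the main bundle,
// and is only downloaded when this page is rendered
const HeavyChart = dynamic(() => import('../components/HeavyChart'))

function HeavyDemo() {
  return <HeavyChart />
}

export default HeavyDemo
</code></pre></div></div>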
<p>As we mentioned earlier, when you create the application it will generate a README file which has some basic information, such as how to run and build the application. For convenience, I’ll list some commands here:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># to run the application</span>
npm run dev
<span class="c"># or </span>
yarn dev
<span class="c"># to build the application</span>
npm run build
<span class="c"># or </span>
yarn build
</code></pre></div></div>
<p>In order to see the build folder you need to build your application.<br />
Once that is done you can navigate to the <code class="language-plaintext highlighter-rouge">.next</code> directory, where you will see something like the following:</p>
<pre><code class="language-.next/static">|- chunks
|- {someNumber}.{hash}.js
|- commons.{hash}.js
|- runtime
|- main-{hash}.js
|- webpack-{hash}.js
|- {hash}/pages
|- _app.js
|- _error.js
|- index.js
</code></pre>
<p>As we can see from the above, the code splitting is done by chunks, runtime and by page.</p>
<h2 id="routing">Routing</h2>
<p>Routing is another feature that Next.js supports out of the box. Next.js uses the file system to enable routing: every file that you put under the <code class="language-plaintext highlighter-rouge">pages</code> directory with the extension <code class="language-plaintext highlighter-rouge">.js</code>, <code class="language-plaintext highlighter-rouge">.jsx</code>, <code class="language-plaintext highlighter-rouge">.ts</code> or <code class="language-plaintext highlighter-rouge">.tsx</code> automatically becomes a route.</p>
<p><strong>Pages</strong>: A Next.js page is a React component. In the application that we created earlier, we are going to create a page under the <code class="language-plaintext highlighter-rouge">pages</code> directory called <code class="language-plaintext highlighter-rouge">about.tsx</code>, as follows:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/about.tsx</span>
<span class="kd">function</span> <span class="nx">About</span><span class="p">(){</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">h1</span><span class="p">></span>About<span class="p"></</span><span class="nt">h1</span><span class="p">></span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">About</span>
</code></pre></div></div>
<p>The above is a React component that simply returns an <code class="language-plaintext highlighter-rouge">h1</code> heading. Now if we run the application and go to the <code class="language-plaintext highlighter-rouge">/about</code> route, we will see the about page with the <code class="language-plaintext highlighter-rouge">About</code> heading that we just created. That’s how easy it is to create a route.</p>
<h3 id="index-routes">Index Routes</h3>
<p>In Next.js a file named <code class="language-plaintext highlighter-rouge">index.ts</code> or <code class="language-plaintext highlighter-rouge">index.js</code> at the root of any directory under the <code class="language-plaintext highlighter-rouge">pages</code> directory will automatically become a route.</p>
<p>Here are some examples:</p>
<ul>
<li>Creating an <code class="language-plaintext highlighter-rouge">index.ts</code> page at <code class="language-plaintext highlighter-rouge">pages/index.ts</code> will create a route <code class="language-plaintext highlighter-rouge">/</code>.</li>
<li>Creating a page at <code class="language-plaintext highlighter-rouge">pages/blog/index.ts</code> will create a route at <code class="language-plaintext highlighter-rouge">/blog</code></li>
</ul>
<h3 id="nested-routes">Nested Routes</h3>
<p>If we need to create a nested structure then, under the <code class="language-plaintext highlighter-rouge">pages</code> directory, we would create directories and files which would then map to the routes. Here are some examples:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">pages/articles/my-first-article.tsx</code> will create a route <code class="language-plaintext highlighter-rouge">/articles/my-first-article</code>.</li>
<li><code class="language-plaintext highlighter-rouge">pages/admin/settings/user.tsx</code> will create a route at <code class="language-plaintext highlighter-rouge">/admin/settings/user</code></li>
</ul>
<p>To demonstrate this we will create the following two pages in our app.</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/articles/my-first-article.tsx</span>
<span class="kd">function</span> <span class="nx">MyFirstArticle</span><span class="p">(){</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">h1</span><span class="p">></span>My First Article<span class="p"></</span><span class="nt">h1</span><span class="p">></span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">MyFirstArticle</span>
</code></pre></div></div>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/admin/settings/user.tsx</span>
<span class="kd">function</span> <span class="nx">User</span><span class="p">(){</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">h1</span><span class="p">></span>User<span class="p"></</span><span class="nt">h1</span><span class="p">></span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">User</span>
</code></pre></div></div>
<p>Now if you run the application and navigate to <code class="language-plaintext highlighter-rouge">/articles/my-first-article</code> or <code class="language-plaintext highlighter-rouge">/admin/settings/user</code>
you will see the above components being served respectively.</p>
<h3 id="dynamic-routes">Dynamic Routes</h3>
<p>As explained above, routes are defined based on the files and folders that we create under <code class="language-plaintext highlighter-rouge">pages</code>, i.e. every file in there maps to a route.</p>
<p>Having said that, in more complex applications there are lots of cases where predefined routes are not enough, and this is where dynamic routes come in.</p>
<p>To create dynamic routes you can use square brackets in the name of the file
like so <code class="language-plaintext highlighter-rouge">[param]</code>.</p>
<p>In the following we are going to create a dynamic route so that when people go to <code class="language-plaintext highlighter-rouge">articles/<id></code>, the article id is sent as a query parameter to the page; in turn we can read this article id from the router query object and do further processing.</p>
<p>Let’s start by creating <code class="language-plaintext highlighter-rouge">[aid].tsx</code> file under <code class="language-plaintext highlighter-rouge">pages/articles/</code> like the following:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/articles/[aid].tsx</span>
<span class="k">import</span> <span class="p">{</span><span class="nx">useRouter</span><span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">next/router</span><span class="dl">'</span>
<span class="kd">const</span> <span class="nx">Article</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">router</span> <span class="o">=</span> <span class="nx">useRouter</span><span class="p">()</span>
<span class="kd">const</span> <span class="p">{</span><span class="nx">aid</span><span class="p">}</span> <span class="o">=</span> <span class="nx">router</span><span class="p">.</span><span class="nx">query</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">p</span><span class="p">></span>Article id: <span class="si">{</span><span class="nx">aid</span><span class="si">}</span><span class="p"></</span><span class="nt">p</span><span class="p">>;</span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">Article</span><span class="p">;</span>
</code></pre></div></div>
<p>The route <code class="language-plaintext highlighter-rouge">articles/3</code> will be matched by <code class="language-plaintext highlighter-rouge">[aid].tsx</code>, so now if you go to <code class="language-plaintext highlighter-rouge">articles/3</code> it will display <strong>Article id: 3</strong>. The route <code class="language-plaintext highlighter-rouge">articles/3</code> will have the query object <code class="language-plaintext highlighter-rouge">{aid:'3'}</code>. The <code class="language-plaintext highlighter-rouge">id</code> can be anything that best serves your needs, that is to say it can be a string, a number, etc.</p>
<p>If the route is <code class="language-plaintext highlighter-rouge">articles/3?foo=bar</code> then the router query object will contain <code class="language-plaintext highlighter-rouge">{foo:'bar', aid:'3'}</code>, i.e. if you do <code class="language-plaintext highlighter-rouge">console.log(router.query)</code> you will see the above values.</p>
<h3 id="nested-multiple-dynamic-routes">Nested Multiple Dynamic Routes</h3>
<p>In cases where you need two levels of the route to be dynamic, such as <code class="language-plaintext highlighter-rouge">articles/3/a-comment</code>, you would create a folder and another file under that folder. Say, for example, you want to capture the article id and its comment, something like <code class="language-plaintext highlighter-rouge">http://localhost:3000/articles/<id>/<comment></code>; in this case you would create a directory <code class="language-plaintext highlighter-rouge">[aid]</code> under <code class="language-plaintext highlighter-rouge">pages/articles</code> and a TypeScript file under <code class="language-plaintext highlighter-rouge">[aid]</code>, so you would have the structure <code class="language-plaintext highlighter-rouge">pages/articles/[aid]/[comment].tsx</code>.</p>
<p>Then in the <code class="language-plaintext highlighter-rouge">[comment].tsx</code> file put the following:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/articles/[aid]/[comment].tsx</span>
<span class="k">import</span> <span class="p">{</span><span class="nx">useRouter</span><span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">next/router</span><span class="dl">'</span>
<span class="kd">const</span> <span class="nx">Comment</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">router</span> <span class="o">=</span> <span class="nx">useRouter</span><span class="p">()</span>
<span class="kd">const</span> <span class="p">{</span><span class="nx">comment</span><span class="p">}</span> <span class="o">=</span> <span class="nx">router</span><span class="p">.</span><span class="nx">query</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">p</span><span class="p">></span>Comment: <span class="si">{</span><span class="nx">comment</span><span class="si">}</span><span class="p"></</span><span class="nt">p</span><span class="p">>;</span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">Comment</span><span class="p">;</span>
</code></pre></div></div>
<p>If you now go to the route <code class="language-plaintext highlighter-rouge">articles/3/a-comment</code>, the query will contain:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{comment:'a-comment', aid:'3'}
</code></pre></div></div>
<p>As we are mapping several routes inside the <code class="language-plaintext highlighter-rouge">[aid]</code> directory, one way to tackle this is to create an index file inside the <code class="language-plaintext highlighter-rouge">[aid]</code> directory which would match <code class="language-plaintext highlighter-rouge">/articles/<id></code>, and another one, in our case <code class="language-plaintext highlighter-rouge">[comment].tsx</code>, to map <code class="language-plaintext highlighter-rouge">articles/<id>/<comment></code>. The folder structure would then look like the below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>.
├── [aid]
│ ├── [comment].tsx
│ └── index.tsx
└── my-first-article.tsx
</code></pre></div></div>
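<p>For completeness, the index file inside <code class="language-plaintext highlighter-rouge">[aid]</code> could be a minimal sketch like the one below - essentially the earlier <code class="language-plaintext highlighter-rouge">[aid].tsx</code> page moved into the directory:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// file pages/articles/[aid]/index.tsx - matches articles/<id>
import { useRouter } from 'next/router'

const Article = () => {
  const router = useRouter()
  const { aid } = router.query
  return <p>Article id: {aid}</p>
}

export default Article
</code></pre></div></div>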
<h3 id="catching-all-routes">Catching all Routes</h3>
<p>In cases where you want to catch all routes, you would first create a file under the preferred directory, in our case <code class="language-plaintext highlighter-rouge">pages/[...slug]</code>, like so:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// file pages/[...slug].tsx</span>
<span class="k">import</span> <span class="p">{</span><span class="nx">useRouter</span><span class="p">}</span> <span class="k">from</span> <span class="dl">'</span><span class="s1">next/router</span><span class="dl">'</span>
<span class="kd">const</span> <span class="nx">CatchAll</span> <span class="o">=</span> <span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">router</span> <span class="o">=</span> <span class="nx">useRouter</span><span class="p">()</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">p</span><span class="p">></span>This page catches all routes<span class="p"></</span><span class="nt">p</span><span class="p">></span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">CatchAll</span><span class="p">;</span>
</code></pre></div></div>
<p>If you now run the application and navigate to <code class="language-plaintext highlighter-rouge">http://localhost:3000/a</code> it will display
the page that we created above.</p>
<p>If you navigate to <code class="language-plaintext highlighter-rouge">http://localhost:3000/a/b</code> and observe the query with <code class="language-plaintext highlighter-rouge">console.log(router.query)</code>, you will notice that slug now has the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"slug": [
"a",
"b"
]
}
</code></pre></div></div>
<h2 id="static-generation">Static Generation</h2>
<p>One of the great features that Next.js has is static site generation. There are a lot of frameworks that can generate static sites, but what makes Next.js different to other SSG frameworks is the fact that Next.js is a hybrid tool which can generate HTML/CSS/JavaScript at run time as well as at build time; this and lots of other features make Next.js a truly great React SSG framework.</p>
<p>Next.js has two forms of pre-rendering, <strong>Static Generation</strong> and <strong>Server-side Rendering</strong>. The difference between the two is when the static assets, such as HTML, JavaScript, CSS etc., are generated.</p>
<p>With Static Generation, HTML pages are generated at build time and are then reused on each request, whereas with Server-side Rendering the pages are generated on each request.</p>
<p>If Static Generation is used, pages will be generated when you run <code class="language-plaintext highlighter-rouge">next build</code> and from there you can use a CDN if you want to cache the assets.</p>
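<p>For comparison, the Server-side Rendering form of pre-rendering uses the built-in <code class="language-plaintext highlighter-rouge">getServerSideProps</code> function, which runs on every request rather than at build time. A minimal sketch (the file name and props are just an example):</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// file pages/now.tsx - illustrative only; rendered on every request rather than at build time
export async function getServerSideProps() {
  return {
    props: {
      renderedAt: new Date().toISOString(),
    },
  };
}

function Now({ renderedAt }: { renderedAt: string }) {
  return <p>Rendered at: {renderedAt}</p>;
}

export default Now;
</code></pre></div></div>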
<h2 id="static-generation-without-data-fetching">Static Generation without Data Fetching</h2>
<p>This is the simple rendering of static pages, such as the <code class="language-plaintext highlighter-rouge">my-first-article.tsx</code> page that we created earlier:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">MyFirstArticle</span><span class="p">(){</span>
<span class="k">return</span> <span class="p"><</span><span class="nt">h1</span><span class="p">></span>My First Article<span class="p"></</span><span class="nt">h1</span><span class="p">></span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">MyFirstArticle</span>
</code></pre></div></div>
<p>Once we have created this page and built the application, it will be available as an HTML page.</p>
<h2 id="data-fetching-and-static-generation">Data Fetching and Static Generation</h2>
<p>In cases where you have to fetch data from an API or similar, you would use the built-in function <code class="language-plaintext highlighter-rouge">getStaticProps</code> to fetch the data at build time and then serve static pages.</p>
<p>If you are also generating dynamic paths then you would need to use the <code class="language-plaintext highlighter-rouge">getStaticPaths</code> built-in function. This function is used in addition to <code class="language-plaintext highlighter-rouge">getStaticProps</code>; see the sketch after the working example below.</p>
<h3 id="data-fetching-and-static-generation-working-example">Data Fetching and Static Generation Working Example</h3>
<p>In the following we will go through an example to illustrate this. Let’s say we are fetching article data from an API; the way we would implement that is to first fetch the data using <code class="language-plaintext highlighter-rouge">getStaticProps</code>, and then have another function to consume that data.</p>
<p>In this example we will use the sample endpoint <code class="language-plaintext highlighter-rouge">https://jsonplaceholder.typicode.com/posts</code>, which returns placeholder posts.</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">function</span> <span class="nx">Article</span><span class="p">({</span><span class="nx">props</span><span class="p">})</span> <span class="p">{</span>
<span class="c1">// Here you can further work with props and manipulate data as required</span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">async</span> <span class="kd">function</span> <span class="nx">getStaticProps</span><span class="p">()</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">res</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">fetch</span><span class="p">(</span><span class="dl">'</span><span class="s1">https://jsonplaceholder.typicode.com/posts</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">articles</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">res</span><span class="p">.</span><span class="nx">json</span><span class="p">()</span>
<span class="k">return</span> <span class="p">{</span>
<span class="na">props</span><span class="p">:</span> <span class="p">{</span>
<span class="nx">articles</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">export</span> <span class="k">default</span> <span class="nx">Article</span><span class="p">;</span>
</code></pre></div></div>
<p>In the above functions we are fetching all the data from the endpoint and returning it as <code class="language-plaintext highlighter-rouge">props</code>; these <code class="language-plaintext highlighter-rouge">props</code> are then passed to the page component above as a parameter, where we can do further processing.</p>
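<p>For a dynamic route such as <code class="language-plaintext highlighter-rouge">pages/articles/[aid].tsx</code>, <code class="language-plaintext highlighter-rouge">getStaticProps</code> alone is not enough: Next.js also needs to know which article ids to pre-render at build time, which is what <code class="language-plaintext highlighter-rouge">getStaticPaths</code> provides. The following is only a minimal sketch using the same placeholder API, showing how the earlier <code class="language-plaintext highlighter-rouge">[aid].tsx</code> page could be converted to static generation:</p>
<div class="language-tsx highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// file pages/articles/[aid].tsx - illustrative static-generation sketch
export async function getStaticPaths() {
  const res = await fetch('https://jsonplaceholder.typicode.com/posts');
  const articles = await res.json();
  return {
    // one path per article id; fallback: false returns a 404 for unknown ids
    paths: articles.map((article: { id: number }) => ({
      params: { aid: String(article.id) },
    })),
    fallback: false,
  };
}

export async function getStaticProps({ params }: { params: { aid: string } }) {
  const res = await fetch(`https://jsonplaceholder.typicode.com/posts/${params.aid}`);
  const article = await res.json();
  return { props: { article } };
}

function Article({ article }: { article: { title: string } }) {
  return <h1>{article.title}</h1>;
}

export default Article;
</code></pre></div></div>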
<h2 id="conclusion">Conclusion</h2>
<p>If you are familiar with JavaScript, Next.js is very easy to use and learn. Out-of-the-box support for TypeScript, server-side rendering, static page generation and lots more makes Next.js a very strong contender for your next project, be it a blog or a complex application that consumes an internal/external API.</p>
<p><a href="https://capgemini.github.io/development/next.js-fundamentals-through-examples/">Next.js Fundamentals Through Examples</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on February 11, 2022.</p>https://capgemini.github.io/development/dcx-react-library2022-01-21T00:00:00+00:002022-01-21T00:00:00+00:00Daniele Zuricohttps://capgemini.github.io/authors#author-daniele-zurico
<p>At Capgemini, on most projects we usually create the same standard basic components, for example buttons, form inputs, radio buttons, etc.
These components can be time-consuming to set up initially, given we need to implement all the logic (validation, user interaction, tests, documentation, QA validation, AA accessibility) and fix any bugs raised.
This step can take a fair bit of time and effort, especially when building solid foundations for our software. It can take 3 to 4 sprints for a full team to build a set of high-quality common components, which is why in DCX (Digital Customer Experience) we decided to start building a library to speed up this initial process.</p>
<h2 id="day-0-the-initial-challenge">Day 0: The initial challenge</h2>
<p>When our DCX (Digital Customer Experience) team at Capgemini started building our first library we were under the impression that a component library would take the same effort as creating a web application for a client.
The team was very much used to client-oriented development processes, with a well-guided set of requirements usually driven by a business analyst, and a style guide and colour palette from a UI/UX designer. Conversely, the development process of a library with no pre-defined requirements was uncharted territory, so the first question the team faced was “Where do we start?”.
After a few weeks, the team came to the realisation that the strategy had to be re-evaluated; this blog post will explain the journey thus far and share the story of the development of the dcx-react-library.</p>
<h2 id="day-1-why-do-we-need-another-library">Day 1: Why do we need another library?</h2>
<p>Sometimes a developer may suggest using bootstrap or react-material for the UI. After all, these libraries already have a pre-built set of components that will speed up our development. The problem with using these common libraries is that our clients’ brand and project requirements need the UX to be specific and unique.</p>
<p>We’ve tried it many times before and it hasn’t worked. We really need our own set of base components… no, hold on, I’ve got an idea… what if we go on Google, research some components we need and import them into our project? That should work.</p>
<p>Well, occasionally a third-party library works, but after a few months you may regret your decision for several reasons, such as:</p>
<ol>
<li>My project has custom requirements the library can’t support. You could create a ticket on the GitHub project in the hope that the owner implements it. Hopefully within a few weeks or a month. Hopefully.</li>
<li>There’s a bug. You create a ticket on the GitHub page, and no one replies to you, all the while your client is pressuring you to resolve the problem that’s impacting all their customers.</li>
<li>You get lucky and the library you found and implemented looks and works great out of the box. But… the component you chose has 50 dependencies and after a while some are obsolete, and others have security vulnerabilities. It’s not long before your project is failing all the health checks.</li>
<li>The component you imported is now deprecated and no longer maintained by the author.</li>
</ol>
<p>I can easily continue listing more reasons, but I think you understand what prompted us to create our own library.</p>
<h2 id="day-2-what-will-our-library-look-like">Day 2: What will our library look like?</h2>
<p>After we all agreed that relying on different libraries is not a sustainable approach, we started to think about how we could build a library that can be used by all our clients and what kind of components we need to create.</p>
<p>We didn’t want to repeat the same mistakes and limitations we saw in the other libraries, so we decided to implement our library our way:</p>
<ol>
<li><strong>Style agnostic</strong>: all components are built without styling, so consumers are able to style components as desired based on the requirements of the clients and UI/UX experts.</li>
<li><strong>Few dependencies</strong>: we don’t want to rely heavily on external dependencies, but at the same time we don’t want to reinvent the wheel, so we decided to use a few external dependencies (only 5) that each have 0 dependencies themselves.</li>
<li>With <strong>super-powers</strong>: every component is both flexible and extensible; they are intrinsically built with possible requirements and use cases in mind.</li>
<li><strong>Small bundle</strong>: every possible technique should be used to provide a tiny bundle consisting of only a few kilobytes.</li>
<li><strong>Fully tested</strong>: our library will have 100% code coverage.</li>
<li><strong>AA accessible</strong>: every component is built and tested to be 100% accessible.</li>
</ol>
<h2 id="day-3-how-will-we-add-documentation-to-our-library">Day 3: How will we add documentation to our library?</h2>
<p>I used to Google a lot, looking for the next cool library to use but I’m simply not patient enough to read thousands of lines of code because a library isn’t well documented. I know it’s really boring for most developers to write good documentation but if we want someone else to use our library, we honestly don’t have a choice.</p>
<p>Our documentation contains around 190 stories, is organised by components and each component has 4 main sections:</p>
<ol>
<li>Documentation: explains the aim of the component with a simple example and lists all the available properties while providing a detailed description for all of them.
<img src="/images/2022-01-18-dcx-react-library/dcx-react-library-documentation.jpg" alt="Documentation" /></li>
<li>Un-styled: UI/UX designers may be shocked to see this section because they’re going to see how the component looks naked, without any styles applied.
<img src="/images/2022-01-18-dcx-react-library/dcx-react-library-un-styled.jpg" alt="Unstyled" /></li>
<li>Styled: UI/UX designers will feel better here. In this section we provide an example of how the components look once a style is applied. We also provide the code and CSS that can be copied for use.
<img src="/images/2022-01-18-dcx-react-library/dcx-react-library-styled.jpg" alt="Styled" /></li>
<li>Live: this is the section that we love the most. In this section developers can interact with the code, adding and removing properties to see how the component renders.
<img src="/images/2022-01-18-dcx-react-library/dcx-react-library-live.jpg" alt="Live" /></li>
</ol>
<h2 id="day-4-how-many-components-have-we-built-so-far">Day 4: How many components have we built so far?</h2>
<p>OK, I need to come clean… we didn’t just spend 3 days getting to this point. It’s taken considerably more and at the time of writing this post we’ve released version 0.3.6 into production, and we’ve already started working on version 0.4.
The library has more than 20 components and most of them are being used in some of our client projects.</p>
<p>We look forward to creating more and will continue to listen to our developers and clients, adding more functionality and simplifying the usage of the components.</p>
<p><img src="/images/2022-01-18-dcx-react-library/dcx-react-library-components-list.jpg" alt="Components available" /></p>
<h2 id="whats-next">What’s next?</h2>
<ul>
<li>
<p>If you’d like to know more about the library, feel free to contact <a href="daniele.zurico@capgemini.com">Daniele Zurico</a> or <a href="isaac.babalola@capgemini.com">Isaac Babalola</a>.</p>
</li>
<li>
<p>If you’re curious to give it a try, it’s publicly available on <a href="https://www.npmjs.com/package/@capgeminiuk/dcx-react-library">npm</a>.</p>
</li>
<li>
<p>If you’d like to contribute, you will need to request access from our <a href="https://github.com/Capgemini/dcx-react-library">private repository</a>.</p>
</li>
<li>
<p>If you want to familiarise yourself with all the components we built, take a look at our <a href="https://main--6069a6f47f4b9f002171f8e1.chromatic.com">storybook documentation</a>.</p>
</li>
</ul>
<p><a href="https://capgemini.github.io/development/dcx-react-library/">Introducing the DCX React Library</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on January 21, 2022.</p>https://capgemini.github.io/culture/software-engineering-gender-gap2021-10-15T00:00:00+01:002021-10-15T00:00:00+01:00Paul Monkhttps://capgemini.github.io/alumni#author-paul-monk
<h2 id="a-brief-history-lesson">A Brief History Lesson</h2>
<p>There is currently a large disparity in the percentage of female vs male Software Engineers. According to <a href="https://www.wisecampaign.org.uk/statistics/2019-workforce-statistics-one-million-women-in-stem-in-the-uk/">Wise</a> women make up just 16.4% of the IT Engineering workforce. In education <a href="https://www.engineeringuk.com/media/1691/gender-disparity-in-engineering.pdf">another study</a> shows that Computing has one of the lowest take-ups by women across all Engineering categories, with just 16% of degree candidates being female.</p>
<p>But it wasn’t always this way. There have been plenty of female role models within software engineering in the past, in fact computers wouldn’t be what they are today without these pioneering women:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Ada_Lovelace">Ada Lovelace</a> (1815 – 1852) - regarded as the first computer programmer</li>
<li><a href="https://en.wikipedia.org/wiki/Grace_Hopper">Grace Hopper</a> (1906 - 1992) - a computer programming pioneer who invented the compiler and theory behind high-level programming languages</li>
<li><a href="https://en.wikipedia.org/wiki/Dorothy_Vaughan">Dorothy Vaughan</a> (1910 - 2008) - NASA’s first African-American manager, taught herself and her staff Fortran</li>
<li><a href="https://en.wikipedia.org/wiki/Mary_Kenneth_Keller">Mary Kenneth Keller</a> (1913 - 1985) - first woman to earn a doctorate in computing in the US, part of the team who created BASIC</li>
<li><a href="https://en.wikipedia.org/wiki/ENIAC#Programmers">The ENIAC Programmers</a> (1945) - 6 women who were the first programmers of the first digital computer</li>
<li><a href="https://en.wikipedia.org/wiki/Carol_Shaw">Carol Shaw</a> (1955 - ) - one of the first female game designers</li>
<li><a href="https://en.wikipedia.org/wiki/Radia_Perlman">Radia Perlman</a> (1958 - ) - invented the spanning-tree protocol, and contributed to many areas of network design and internet routing</li>
</ul>
<p>Computer technology really emerged during World War II, at this time women made up the majority of the engineering workforce. By the 1960s men made up a large majority of all workers. But in software more than one in four programmers were still women. Computing was largely seen as a woman’s job, and women benefitted from a positive stereotype at this time. However there was a large pay gap between women and their male counterparts, according to the book <a href="https://books.google.co.uk/books?id=GWOIXDsLQWwC&printsec=frontcover&dq=Recoding+Gender:+Women%2527s+Changing+Participation+in+Computing&hl=en&sa=X&redir_esc=y#v=onepage&q=salary&f=false">Recoding Gender: Women’s Changing Participation in Computing</a>, in 1969 the median salary for female computer specialists was $7,763, where men earned a median of $11,193 doing the same job! In spite of this, until 1984 the uptake of women in software engineering was increasing. According to the below graph (admittedly from an American source) the percentage of women studying software engineering degrees topped out at 37%, and has been declining ever since.</p>
<p><img src="/images/2021-09-16-software-engineering-gender-gap/women-percentage-over-time.png" alt="Graph showing decline of female degree students since 1984" /></p>
<p>There are many theories for the growing gender disparity. One possible reason is that the 1970s recession meant programmers weren’t in demand at this time, which could have led to a drop-off in female uptake in the field. Another potential reason is that the production of personal computers increased the male uptake in software engineering. A 1985 Apple advert showed how much a computer could help boys, and showed a boy teasing a girl who was trying to use a computer. Advertising like this caused stereotypes to shift and computers became perceived as a thing for boys.</p>
<div class="small-12 medium-8 large-4 small-centered columns">
<div class="flex-video">
<iframe width="640" height="360" src="https://www.youtube.com/embed/rxNjx_VWJ8U" frameborder="0" allowfullscreen=""></iframe>
</div>
</div>
<h2 id="how-the-uk-compares">How the UK Compares</h2>
<p>As previously stated, women make up just 16.4% of the IT workforce in the UK. But other countries have a much higher percentage of women in this area. The graph below shows several countries where women make up 25% or more of employees in the tech industry:</p>
<p><img src="/images/2021-09-16-software-engineering-gender-gap/women-in-tech-by-country.png" alt="Graph showing percentage of women in tech per country" />
<a href="https://www.europeanwomenintech.com/hs-fs/hubfs/Women%20in%20Tech%20By%20Country%20Graph.png?width=900&name=Women%20in%20Tech%20By%20Country%20Graph.png" class="image-source text-center">Image from: European Women In Tech</a></p>
<p>According to <a href="http://www.unesco.org/new/en/media-services/single-view/news/women_still_a_minority_in_engineering_and_computer_science/">Unesco</a>, 50% of Malaysia’s engineers are women, and for Oman the figure is 53%. These figures highlight that the UK doesn’t do well when it comes to gender diversity in the workplace, and that there is lots of potential for improvement in this area.</p>
<p>At the time of writing, within Capgemini women make up just over 30% of our junior grades, but this decreases to less than 20% of our senior grades and roles. At Capgemini in India, women constitute over 35% of our workforce. Within our Open Source Cloud Engineering team, women make up just 15.4%.</p>
<h2 id="women-in-software-today">Women in software today</h2>
<p>There is lots of work going on to improve gender diversity in the workplace. For 5 years in a row Capgemini has been included in the <a href="https://www.capgemini.com/gb-en/2021/04/capgemini-has-been-listed-in-the-times-top-50-employers-for-women-2021/">Times Top 50 Employers for women list</a>. There are also lots of female role models around today, such as:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Parisa_Tabriz">Parisa Tabriz</a> - A Director of Engineering at Google</li>
<li><a href="https://en.wikipedia.org/wiki/Juliana_Rotich">Juliana Rotich</a> - A tech entrepreneur and strategic advisor</li>
<li><a href="https://en.wikipedia.org/wiki/Shafi_Goldwasser">Shafi Goldwasser</a> - A pioneer in cryptography, won the Turing Award in 2012</li>
<li><a href="https://en.wikipedia.org/wiki/Jade_Raymond">Jade Raymond</a> - A video game developer, led the creation of Assassin’s Creed, founded Ubisoft’s Toronto subsidiary</li>
<li><a href="https://www.client-server.com/blog/2018/02/sara-haider-the-android-developer-inspiring-girls-to-code-international-womens-day-2018">Sara Haider</a> - An engineer for Android apps at Twitter who led development of <a href="https://en.wikipedia.org/wiki/Vine_(service)">Vine</a>, which had 200 million users at its peak</li>
<li><a href="https://en.wikipedia.org/wiki/Amanda_Wixted">Amanda Wixted</a> - An iOS app developer, led development of <a href="https://en.wikipedia.org/wiki/FarmVille">FarmVille</a>, founder of Meteor Grove Software</li>
<li><a href="https://www.capgemini.com/gb-en/careers/life-at-capgemini/women-at-capgemini-uk-1/be-a-role-model/">Women at Capgemini</a> - Capgemini have many great female engineers</li>
</ul>
<p>There are a lot of stereotypes still being applied to women in software engineering, but this is no longer tolerated within mainstream companies. In 2017, a Google employee named James Damore wrote an internal memo about qualities he thought were more commonly found in women, including higher rates of anxiety, and argued that this explained why they weren’t thriving in the competitive world of coding. Google fired Damore; however, his opinion does reflect what some people within the software industry think, and highlights the stereotypes that are often applied. Most companies, though, do not allow this kind of behaviour, and the Capgemini software engineering team responded to Damore’s comments with <a href="https://capgemini.github.io/engineering/Capgemini-Engineering-Diversity-Manifesto/">this blog post</a>.</p>
<p>Another issue that is still prevalent is the pay disparity between genders. This is often caused by men holding more senior positions within a company compared to women. The gender pay gap for the UK was 15.5% in 2020 according to <a href="https://www.statista.com/statistics/280710/uk-gender-pay-gap/">Statista</a>; this gap has been narrowing in recent years and is down from 27.5% in 1997.</p>
<h2 id="what-can-be-done">What can be done</h2>
<p>Having researched the statistics of various countries, and worked with many women, I can safely say they are just as capable as men at software engineering. In fact, gender diversity benefits businesses in the same way that any diversity does: more ideas and increased flexibility lead to better products! There are many statistics showing that a better gender balance increases the profitability of a company, for example from <a href="https://www.forbes.com/sites/forbestechcouncil/2020/03/10/top-three-reasons-we-need-more-women-in-tech/?sh=1aee726b15fb">Forbes</a> and <a href="https://www.gallup.com/workplace/236543/business-benefits-gender-diversity.aspx">Gallup</a>. Being more gender diverse leads to a more innovative business, as diverse employees bring different experiences and viewpoints. This increases a business’s ability to solve problems, which is key in the software engineering game. According to the 2011 census, women make up 51% of the population in England and Wales; that is a lot of potential talent that software companies could be tapping into. At the moment, with a male-dominated workforce, we are only utilising 49% of the talent available. Imagine what could be achieved with 100% of it! I personally quite fancy a smartphone that is also a robot, and I think <a href="https://www.theguardian.com/technology/2016/apr/14/robohon-worlds-cutest-smartphone-robot-can-be-yours-for-a-hefty-price-tag">this one</a> is in need of some gender-diverse talent!</p>
<p>We have a role as engineers to ensure the workplace is a fully inclusive space. There are many stereotypes and unconscious biases around female software engineers that aren’t true and aren’t acceptable within today’s society. Everyone has a responsibility to stop any prejudice they encounter. Capgemini does a lot of work in this specific area, and has published an <a href="https://www.capgemini.com/gb-en/our-active-inclusion-strategy/">Active Inclusion Strategy</a> which specifically highlights <a href="https://www.capgemini.com/gb-en/careers/life-at-capgemini/active-inclusion-at-capgemini-1/unconscious-bias/">unconscious bias</a>, and includes a range of materials and information to try and combat any unconscious bias stemming from a lack of awareness or information.</p>
<p>The gender diversity gap in software engineering isn’t something that can be solved overnight, but we can all play our part in helping to encourage more women into software engineering careers.</p>
<p><a href="https://capgemini.github.io/culture/software-engineering-gender-gap/">The Software Engineering Gender Gap</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on October 15, 2021.</p>https://capgemini.github.io/architecture/micro-frontends-an-introduction2021-10-08T00:00:00+01:002021-10-08T00:00:00+01:00Lewis Vincehttps://capgemini.github.io/authors#author-lewis-vince
<h2 id="microservices--distributed-systems-in-general">Microservices & Distributed Systems in General</h2>
<p>The growth in popularity of distributed systems is not without reason. Organisations have found that the separation of workstreams along domains / business capabilities has provided teams with a greater degree of product ownership. There is an increase in the overall complexity of the aggregated services of course, as interactions between services have to be agreed (often between multiple teams), but the advantages of independently deployable & scalable components and greater team autonomy should make microservices an attractive option for most projects.</p>
<p>There is a caveat to this however: the term “microservice” is pretty much exclusively used to refer to backend web services. These are services that are usually obscured behind an API gateway and a user interface of some kind. There have been lots of weird and wonderful innovations to improve the way we design and build distributed systems, but that effort is primarily focused around improving <a href="https://en.wikipedia.org/wiki/CAP_theorem">consistency and availability</a> for your backend services (not to mention the countless frameworks, plugins and tools to marry with whatever patterns you decide to go with for your system). This isn’t exactly surprising: if you’re going for as close to perfect as possible for your system’s consistency and availability then you’re going to focus on the services that are actually performing the tasks required of the system.</p>
<h2 id="user--developer-experiences">User & Developer Experiences</h2>
<p>If we move out of the system’s boundary and look at how our users see it from an external view, we’ll find that they don’t (or rather they <strong>really</strong> shouldn’t) see anything that gives away its nature or underlying architecture. Users expect a seamless experience when they interact with an application, aesthetically as well as functionally. If your interface seems to change its design system in places, that’s going to make your system look like a patchwork of different bits of software and the illusion of a single consistent application is broken. If your interface doesn’t have clear and unified paths for performing actions, then your users may not be able to use it and you’ve suddenly got a <strong>very</strong> serious problem.</p>
<p>These risks, combined with the fact that interfaces like web apps are essentially just bundles of static assets that users have access to, have led to the default architectural option being some flavour of monolith for frontend applications. Now I’m sure I don’t have to go through the effort of explaining where monolithic applications fall short; those of you who haven’t had the pleasure of building or maintaining one will undoubtedly have heard tales from those who have. However, it’s worth touching on some techniques that have been used to attempt to distribute work on a monolith between different teams.</p>
<p>A module-based approach can seem like a good idea on first consideration, as you can neatly divide up an application into modules and give ownership of each to various teams. What should also be considered is the aggregation of these modules into the final artifact. At some point, these components are going to have to be bundled together, tested, packaged into a deliverable and deployed. The coupling between these modules is usually <em>tight</em>, which increases the likelihood of changes in one component impacting another. If this is the case, then you’re not making the most of a distributed system since your teams will still have to negotiate when making changes that affect the interfaces between their components.</p>
<p>Now that they’ve been put into the appropriate context, we can <em>finally</em> start talking about micro frontends.</p>
<h2 id="micro-frontends-a-new-alternative">Micro Frontends: a New Alternative</h2>
<p>Let’s start with a (fairly vague) definition:</p>
<blockquote>
<p>A micro frontend is a semi-independent component that can be independently deployed and dynamically integrated into a user interface.</p>
</blockquote>
<p>This doesn’t really do much for us, so let’s dig into it a bit. What we’re basically talking about is an extension of the thinking that brought microservices into being - what if we cut up our frontend along some meaningful boundaries and developed the pieces separately? You can split the work between teams and let them build their own components in their own way, which is the first advantage this approach brings. Each team doesn’t have to negotiate with every other team to be able to use their preferred tools, since they own the development and deployment aspects of their product. You could have teams using completely different JS libraries, built using different CI tools and deployed to different platforms, and you can still integrate their work together to form a single seamless frontend. This approach will be appreciated by the teams, as they’re freed from the need to agree with all other teams on what tools they should all use.</p>
<p>Since your teams can all go off and deliver the functionality they’re responsible for, you <strong><em>can</em></strong> also bring in more contributors without impacting developer experience (please note I’m emphasizing the word “can”, as simply throwing bodies haphazardly at a project tends to make things worse instead of better). As long as your frontend is decomposed sensibly, and teams are working together to communicate across their component boundaries properly then you shouldn’t have any problems.</p>
<p><strong><em>NOTE:</em></strong> At this point it’s worth mentioning that micro frontends are still evolving as a concept and there are lots of ways to use them, so take some of this with a pinch of salt. You should apply these methods in a way that works best for your project, but I can speak from experience that the techniques mentioned in this article have worked for my past projects.</p>
<p>If we take a standard web application and start cutting it up, we can see some clear potential micro frontends:</p>
<figure>
<img src="/images/2021-09-08-micro-frontends-an-introduction/die-arbeit-store-page-highlighted.jpg" alt="Image showing highlighted sections of an ecommerce application." class="centered medium-8" />
<figcaption>An ecommerce application, with highlighted sections that can be developed as micro frontends</figcaption>
</figure>
<p>Here we can see that there are micro frontends to be made for the navigation bar at the top of the page, the cart summary popup, and the store item components. In theory you could have each of these developed and owned by different teams, and integrated together into a single application.</p>
<h2 id="integration">Integration</h2>
<p>It’s all well and good to develop parts of an application in isolation, but at some point, you’re going to have to stitch everything together. For microservices this is usually achieved with an API gateway, which serves as the entry point into a system and sometimes takes care of cross-cutting concerns such as authentication pathways. Micro frontends also have a single entry point when a user accesses them, usually referred to as a <strong>container</strong> or <strong>root</strong> component. When the application is loaded up this is what is given to the browser, and within it is the code required to load up other components from different micro frontends.</p>
<p>The difference between micro frontends and module-based monoliths should be made clear here: the container application will <em>dynamically</em> load components into the browser DOM from remote sources when they are needed; the initial bundles of JS and HTML returned from the user’s first request do not contain any components belonging to other micro frontends. Instead, the root component contains URLs to various micro frontend resources that are retrieved separately (these could be hosted at a different address, built by a different team using different technologies, and deployed using a different cloud services provider). This gives teams the advantage of being able to deploy a new version of a micro frontend and have it become immediately available through the root component, without having to pull new versions into any other components and build them again. You will still need to make sure that any contracts you have established between components are still being respected of course, but you can use the usual techniques for dealing with breaking changes when required.</p>
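<p>To make that concrete, here is a minimal sketch of how a container might declare its remotes using Webpack’s Module Federation plugin (one of the integration options discussed below); the remote names and URLs are purely illustrative, not taken from a real project:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript">// webpack.config.ts for the container (root) application -- a sketch only.
import webpack from 'webpack';

export default {
  plugins: [
    new webpack.container.ModuleFederationPlugin({
      name: 'root',
      remotes: {
        // each entry points at wherever that team has deployed its build output
        nav: 'nav@https://nav.example.com/remoteEntry.js',
        cart: 'cart@https://cart.example.com/remoteEntry.js',
      },
    }),
  ],
};

// Inside the root component, a remote module is then pulled in lazily at runtime:
// const CartSummary = React.lazy(() => import('cart/CartSummary'));</code></pre></figure>
<p>Because the remote entry is resolved at runtime, redeploying the cart micro frontend immediately changes what the container loads, without rebuilding the root.</p>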
<h2 id="state--communication">State & Communication</h2>
<p>Building an application with completely isolated components would be a very straightforward thing to do, but I’m sure we’re all aware that it is rarely a scenario that comes up on actual projects. It’s usually inevitable that components will eventually need to communicate with each other, responding to either the user or another system’s actions. Now one way to tackle this would be by following the <a href="https://www.exclamationlabs.com/blog/the-case-for-unidirectional-data-flow/">unidirectional data flow</a> pattern, utilizing libraries like <a href="https://redux.js.org/">Redux</a> to manage state. This pattern is useful for decoupling UI components from state modifications, but the concept of a central state store presents some problems for us in an application composed of micro frontends.</p>
<p>It would be tempting to keep a state store in the root component and have this passed into micro frontends when they are loaded. This would give each micro frontend access to the same state store, allowing them to share data with each other and update themselves when any changes to the state object are made. The problem with this is that you are adding a dependency on the type of state management library the root component is using, which goes against the idea that each micro frontend should be independent. Micro frontends should be able to differ in their implementation without affecting other components in the final application, so restricting them in this way should be avoided wherever possible.</p>
<p>We can allow each micro frontend to manage its own internal state, but there is still the problem of communication. It could be tempting to define interfaces that are implemented in each micro frontend and made available to any component in the application that needs to use them, but this can create tightly coupled micro frontends which makes things difficult to maintain and augment. It is here that we could get some benefit from thinking about this in a different way.</p>
<p>Instead of thinking about side-effects of actions as something that a component applies to another component through a function call, we can think about an action as an event that is broadcast via a message broker. By doing this we can keep things loosely coupled, allowing easy integration with other components through subscribing to particular message channels or topics. For example, if we had a shopping application with an “add to cart” button, this button could trigger an event that all interested components (a cart contents component for example) could subscribe to. We can use a number of tools (my current preferred choice being <a href="https://www.npmjs.com/package/postal">Postal.js</a>) to achieve this, adding relevant data (item IDs etc.) into the body of the event if required.</p>
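<p>As a rough illustration of that pattern, a store item micro frontend could publish an event over Postal.js and a cart component could subscribe to it; the channel and topic names and the payload shape below are invented for the example:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript">import postal from 'postal';

// In the cart micro frontend: react to items being added anywhere in the application.
const subscription = postal.subscribe({
  channel: 'cart',
  topic: 'item.added',
  callback: (data: { itemId: string; quantity: number }) => {
    console.log(`Adding ${data.quantity} x ${data.itemId} to the cart summary`);
  },
});

// In the store item micro frontend: publish the event when the button is clicked.
postal.publish({
  channel: 'cart',
  topic: 'item.added',
  data: { itemId: 'sku-1234', quantity: 1 },
});

// Tidy up the subscription when the cart component is unmounted.
subscription.unsubscribe();</code></pre></figure>
<p>Neither side imports anything from the other; the only contract between them is the shape of the event itself.</p>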
<p>An advantage of this event-based approach is that you can document your APIs to be used by other teams when integrating with your micro frontend. Tools like <a href="https://www.asyncapi.com/">AsyncAPI</a> excel at this, giving you a neat, unified definition of all the channels and events that your component(s) will potentially emit, and what they signify. You can also version your events as you would for event-driven systems, allowing for gradual phasing out of old event processing as your system matures and changes. Your API specification can also include information on the build tools you are currently using to export your micro frontend, and the dependencies that it is set up to share (shared dependency resolution is a feature of a number of frameworks used to integrate micro frontends, and would be very important for teams to see when preparing to integrate with your components).</p>
<h2 id="potential-problems">Potential Problems</h2>
<p>Building a micro frontend-based application has many advantages, but there are also some potential issues that you should be aware of so your team are spared some painful problems in the future.</p>
<ul>
<li>Integration technology alignment: you will have to ensure that all teams working on micro frontends are working with the same technologies for exporting their components, as the root application will need to load them all in. There are various options out there: <a href="https://webpack.js.org/concepts/module-federation/">Webpack’s Module Federation plugin</a> and <a href="https://single-spa.js.org/">single-spa</a>, to name a couple. Module Federation is very flexible, allowing you fine-grained control over things like shared dependencies. However, single-spa is very quick to get set up and comes with very useful run configurations out of the box (to assist with local development and running micro frontends in isolation). I have had success in the past with the Webpack solution, but you should choose the tool that’s going to work best for your team(s).</li>
<li>Performance: as your development teams are given the freedom to choose the libraries that they use in their micro frontends, there is a risk that your application will become bloated. There are patterns and techniques you can introduce to mitigate this, both technical and organisational. If you have dependencies that are used in multiple micro frontends, you can configure whatever tool you’re using for builds to use dependencies that are shared by the root component. For example, if you’re using React in a number of micro frontends then you can add this in the root as a shared dependency in your Webpack or single-spa config, and then any micro frontend that uses React will be able to load without having a duplicate version of React bundled into its own JS files (a sketch of this configuration follows this list). You can also collaborate between your various teams to agree on preferred libraries, which will limit the number of different dependencies being loaded into the final application.</li>
<li>Debugging: As you can imagine, debugging an application with lots of event-driven micro frontends can get quite complicated, especially when initially integrating new components into an application. If you’re using an event bus like Postal.js, then using a plugin like <a href="https://github.com/postaljs/postal.diagnostics">postal.diagnostics</a> can help you track events passing over the boundaries separating your micro frontends. There can also be some difficulties when working with your build tool to correctly bundle and load micro frontends into an application. This can be very difficult to debug when you’re having loading issues, so I’d advise creating a scaffold with a bare-bones implementation for exporting a micro frontend using something like <a href="https://yeoman.io/">Yeoman</a>. With this you can tailor your generator to suit the functionality that your team may need for preparing a new component for integration, decreasing the possibility of running into integration issues when introducing a new micro frontend into your application.</li>
<li>Styling clashes: since you will be loading components built by different teams into the same DOM, you’ll need to put measures in place to avoid CSS class clashes. This can be solved by agreeing to use prefixes in all your CSS classes, or using something like <a href="https://css-tricks.com/css-modules-part-1-need/">CSS modules</a>.</li>
</ul>
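<p>To illustrate the shared-dependency point from the list above, here is a sketch of a single micro frontend’s Module Federation configuration, exposing one component and declaring React as a shared singleton; the names, paths and version ranges are illustrative:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript">// webpack.config.ts for a hypothetical cart micro frontend -- a sketch only.
import webpack from 'webpack';

export default {
  plugins: [
    new webpack.container.ModuleFederationPlugin({
      name: 'cart',
      filename: 'remoteEntry.js',
      // modules this team makes available to the container
      exposes: {
        './CartSummary': './src/components/CartSummary',
      },
      // reuse the copy of React already loaded by the container
      // rather than bundling a duplicate into this micro frontend
      shared: {
        react: { singleton: true, requiredVersion: '^17.0.0' },
        'react-dom': { singleton: true, requiredVersion: '^17.0.0' },
      },
    }),
  ],
};</code></pre></figure>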
<h2 id="conclusion">Conclusion</h2>
<p>Micro frontends have a lot of potential if used correctly. It is possible to build large, complex applications by allowing different teams to own part of the web application, but you should take care before you jump straight into using them. As with most patterns and techniques, preparation is key. As long as your teams are communicating properly, and cross-cutting concerns are being handled in a way that is understood by all involved then you shouldn’t run into any problems (at least none that can’t be solved in a relatively straightforward manner).</p>
<p><a href="https://capgemini.github.io/architecture/micro-frontends-an-introduction/">Micro Frontends: an Introduction</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on October 08, 2021.</p>https://capgemini.github.io/development/developers-hippocratic-oath2021-08-13T00:00:00+01:002021-08-13T00:00:00+01:00Malcolm Younghttps://capgemini.github.io/authors#author-malcolm-young
<p>It may initially seem that doctors and web developers don’t have that much in common, but <a href="https://freakonomics.com/podcast/bad-medicine-part-1-story-98-6/">an episode of Freakonomics Radio looking at the Hippocratic oath</a> a while ago got me thinking about what a general code of good practice for the software industry might look like.</p>
<p>Particularly given the state of the tech industry over the past few years, the idea of members of a profession vowing to uphold a specific code of ethics is one that we could learn from, even in a field that initially seems unrelated. Some people are already considering the need for <a href="https://www.wired.co.uk/article/data-ai-ethics-hippocratic-oath-cathy-o-neil-weapons-of-math-destruction">an ethical framework for data science</a>, and there’s clearly a need for developers to think about more than just the technical aspects of the software that we build.</p>
<p>It isn’t the first time that <a href="https://capgemini.github.io/development/making-ethical-development-choices/">the subject of ethics</a> has come up on this blog, although there’s more to the idea of an oath than ethics. A few years ago, Andrew Harmel-Law and I were trying to write <a href="https://capgemini.github.io/development/how-we-work/">a manifesto for the right way to approach software development and build an engineering culture</a>, and I want to revisit that theme, viewing our profession through the lens of some extracts from <a href="https://en.wikipedia.org/wiki/Hippocratic_Oath">the Hippocratic Oath</a>.</p>
<h2 id="i-will-respect-the-hard-won-scientific-gains-of-those-physicians-in-whose-steps-i-walk-and-gladly-share-such-knowledge-as-is-mine-with-those-who-are-to-follow">“I will respect the hard-won scientific gains of those physicians in whose steps I walk, and gladly share such knowledge as is mine with those who are to follow.”</h2>
<p>As so many of us would testify, we stand on the shoulders of giants, particularly when we build systems using open source software. By using tools and knowledge shared by others, we can achieve things that wouldn’t otherwise be feasible, and by sharing our own tools and knowledge, we can do our bit to keep that virtuous circle going. Within our team, we strongly encourage open source contribution, mentoring and knowledge sharing - the whole point of this blog is to share what we have learned.</p>
<h2 id="i-will-apply-for-the-benefit-of-the-sick-all-measures-that-are-required-avoiding-those-twin-traps-of-overtreatment-and-therapeutic-nihilism">“I will apply, for the benefit of the sick, all measures [that] are required, avoiding those twin traps of overtreatment and therapeutic nihilism.”</h2>
<p>I’m not entirely sure what a software equivalent of therapeutic nihilism might be, but I would definitely draw a parallel between medical overtreatment and the tendency often displayed by engineers for <a href="https://martinfowler.com/bliki/Yagni.html">over-engineering solutions</a>, or being <a href="https://mcfunley.com/choose-boring-technology">distracted by technology for its own sake</a>. There’s often a danger that developers get sucked in by the new toys that all the cool kids are playing with, whether that’s a front end framework or an esoteric language or a build process. As developers, we like to build things, and if we’re not careful, we can find ourselves in thrall to the activity of building itself, rather than stepping back and asking whether we are actually solving a problem that needs to be solved.</p>
<h2 id="i-will-remember-that-there-is-art-to-medicine-as-well-as-science">“I will remember that there is art to medicine as well as science”</h2>
<p>As much as we like to talk about being engineers, what we do isn’t purely cold and logical. Yes, there are aesthetic considerations to code, a pursuit of elegance as well as efficiency, but there’s more to it than that. Sometimes when we’re solving problems, we go by hunches. Code somehow <em>feels</em> right, or <a href="https://martinfowler.com/bliki/CodeSmell.html">smells wrong</a>. But more importantly, we need to communicate. We need to ensure that the next developer who maintains our system can understand it.</p>
<h2 id="warmth-sympathy-and-understanding-may-outweigh-the-surgeons-knife-or-the-chemists-drug">“warmth, sympathy, and understanding may outweigh the surgeon’s knife or the chemist’s drug.”</h2>
<p>Writing code is only a small part of what we do. Something that keeps coming up again and again in our industry is that so-called <a href="https://itsyourturnblog.com/lets-stop-calling-them-soft-skills-9cc27ec09ecb#.lzcyril4q">“soft skills”</a> are as valuable as technical knowledge, not least because <a href="https://www.thoughtworks.com/en-gb/insights/blog/tech-not-problem-people-are">“all Computer Science problems are people problems”</a>. So much of our job is about talking to people to understand the problem, rather than jumping into solutions.</p>
<p>Building relationships with clients and other delivery partners is vital - before we can even think about building the thing right, we need to understand the problem area enough to be confident that we’re building the right thing. On top of that, within a team it’s important to build a supportive atmosphere to give colleagues the psychological safety that we all need to do our best work.</p>
<h2 id="i-will-not-be-ashamed-to-say-i-know-not-nor-will-i-fail-to-call-in-my-colleagues-when-the-skills-of-another-are-needed">“I will not be ashamed to say “I know not,” nor will I fail to call in my colleagues when the skills of another are needed”</h2>
<p>When I’m recruiting, I’d much rather hire someone who admits that they don’t know the answer than someone who tries to bluff their way through a difficult question. As we’ve mentioned before on this blog, it’s important to <a href="https://capgemini.github.io/development/its-sometimes-clever-to-admit/">acknowledge our own limitations</a> - I’m proud that we’re not too proud to ask for help, not so arrogant as to think that we know it all. The technologies we work with are so complex, and their use cases are so diverse, that no single person could ever have the breadth and depth of knowledge to be able to answer all possible questions.</p>
<p>On Slack, we have a bot that keeps a karma points ranking - it’s no coincidence that “team” is way out in front, with twice as many points as any individual. When we work together, we can achieve far more than any single person could.</p>
<h2 id="i-will-respect-the-privacy-of-my-patients">“I will respect the privacy of my patients”</h2>
<p>It’s impossible to overstate the importance of keeping user and client data secure on the internet, but beyond that we also need to recognise that our clients trust us to help them solve their problems, and we need to respect that trust.</p>
<h2 id="i-will-remember-that-i-do-not-treat-a-fever-chart-a-cancerous-growth-but-a-sick-human-being-whose-illness-may-affect-the-persons-family-and-economic-stability-my-responsibility-includes-these-related-problems">“I will remember that I do not treat a fever chart, a cancerous growth, but a sick human being, whose illness may affect the person’s family and economic stability. My responsibility includes these related problems”</h2>
<p>Our job as software engineers is not just to build the thing that we’ve been told to, or fix the bug that is described in the ticket we’ve been assigned. We need to <a href="https://capgemini.github.io/agile/bigger-picture-smaller-details/">see the bigger picture</a>, which might mean lots of different things. As well as the code we’re working on, we should consider the system as a whole, whether that’s in terms of security, accessibility, performance, user experience, maintainability, or any number of related considerations.</p>
<p>As well as the system itself, we should bear in mind the wider business goals of our clients, and the society in which it operates. The ticket that one developer works on is part of a sprint, which is part of a project. The project is (or at least should be) part of the overall strategy of the business. The business is part of the wider world. All of these deserve consideration.</p>
<h2 id="i-will-prevent-disease-whenever-i-can-for-prevention-is-preferable-to-cure">“I will prevent disease whenever I can, for prevention is preferable to cure”</h2>
<p>Most developers have a favourite war story about the time they had to pull an all-nighter to figure out some obscure bug or get their project out of some mess. Just as there’s a certain strange glamour attached to epic surgery, there’s a perverse kind of glory associated with these supposedly heroic feats of virtual firefighting. There’s an intensity, a sense of vitality, a buzz, but it’s not healthy. We congratulate ourselves for pulling out all the stops to get ourselves through a crisis, but that crisis should never have been allowed to happen.</p>
<p>What is less exciting, but far more valuable, is making sure that the project never gets into a mess in the first place - vaccinating our project against crisis through solid processes like testing and automation, or the simple expedient of not trying to do too many things too quickly. If we can prevent problems from occurring, we’re all much likelier to enjoy our weekends.</p>
<h2 id="i-will-protect-the-environment-which-sustains-us-in-the-knowledge-that-the-continuing-health-of-ourselves-and-our-societies-is-dependent-on-a-healthy-planet">“I will protect the environment which sustains us, in the knowledge that the continuing health of ourselves and our societies is dependent on a healthy planet.”</h2>
<p>More and more, we need to consider the <a href="https://www.capgemini.com/2021/05/sustainability-cloud-computing-with-microsoft-azure/">environmental impact of technology</a>, and we can no longer ignore the contribution to climate change made by what we do, whether that’s in what we build, the data centres where it’s hosted, or how we go about our business. We should be doing more to aim for sustainability, and let’s not get onto the subject of <a href="https://www.investopedia.com/tech/whats-environmental-impact-cryptocurrency/">blockchain</a>.</p>
<h2 id="i-will-remember-that-i-remain-a-member-of-society-with-special-obligations-to-all-my-fellow-human-beings">“I will remember that I remain a member of society, with special obligations to all my fellow human beings”</h2>
<p>Sometimes there can be <a href="https://xkcd.com/627/">a division between ‘technical people’ and ‘normal people’</a>, a sense that the technology team are somehow separate from ‘the business’. Developers can sometimes be guilty of fostering that sense of division, perhaps by speaking in jargon or misjudging the level of detail that’s appropriate to the audience, or behaving as if considerations of business needs are somehow beneath us.</p>
<p>It’s too easy for developers to <a href="https://idlewords.com/talks/sase_panel.htm">imagine that technology can solve all the problems</a>, or to think that normal rules don’t apply to technology or technologists. We need to remind ourselves that we don’t exist in some separate sphere where we can devote ourselves to the loftier pursuit of solving technical problems, outside of the messy realities of human interaction - like it or not, we’re all in it together.</p>
<h2 id="may-i-always-act-so-as-to-preserve-the-finest-traditions-of-my-calling-and-may-i-long-experience-the-joy-of-healing-those-who-seek-my-help">“May I always act so as to preserve the finest traditions of my calling and may I long experience the joy of healing those who seek my help.”</h2>
<p>It may be overstating things to describe what software engineers do as a calling, but we should regard it as a profession. That doesn’t just mean that we get paid for it, but that working with software should be <a href="https://en.wikipedia.org/wiki/Profession">“a vocation founded upon specialized educational training, the purpose of which is to supply disinterested objective counsel and service to others”</a>. We should maintain professional standards, and we should take our responsibilities seriously.</p>
<p>That’s not to say that there should be high barriers to entry - speaking as <a href="https://red-route.org/articles/how-i-got-here-there">someone who evolved into a web developer</a>, one of the things I like about building websites is that you can start (and continue) by tinkering. The technology world as a whole, and the open source software movement in particular, needs enthusiastic amateurs, people without formal training, people who are coming at things from a different direction - people who remain members of society.</p>
<p>The tech industry may not have been around long enough to have such well-established traditions as medicine, but it’s vital that we don’t get distracted by the new and shiny, or neglect the fundamentals. Just because computers are more powerful than they used to be, it doesn’t mean that we can ignore the basic principles of efficient computation. Web developers building things with JavaScript frameworks need to be aware of the basics of semantic HTML and accessibility.</p>
<p>Solving problems can be fun. It’s satisfying not just because it feels good to help people, but it also brings its own rewards. Being able to solve problems with technology brings with it some power, and with power comes responsibility, which brings us back to the need for some principles to work by.</p>
<p>So having looked at some parallels with the medical oath, perhaps this could be a first iteration of a new oath for software engineers:</p>
<h2 id="a-developers-oath">A developer’s oath</h2>
<p>I swear to fulfil, to the best of my ability and judgment, this covenant:</p>
<p>I will respect the hard-won scientific gains of those developers in whose steps I walk, and gladly share such knowledge as is mine with those who are to follow.</p>
<p>I will apply, for the benefit of the project, all measures that are required, avoiding those twin traps of over-engineering and therapeutic nihilism.</p>
<p>I will remember that there is art to code as well as science, and that readability, clarity, and maintainability may outweigh brevity and complexity.</p>
<p>I will write appropriate comments in my code to help the next developer who maintains it to understand why the system has been built the way it has.</p>
<p>I will not be ashamed to say “I know not”, nor will I fail to call in my colleagues when the skills of another are needed for a project’s progress.</p>
<p>I will not take code review comments personally, and I will use code review as an opportunity to support and help my colleagues.</p>
<p>I will respect the privacy of my clients and users of my software, for their problems are not disclosed to me that the world may know. Most especially must I tread with care in matters of personally identifiable information.</p>
<p>I will remember that code should always be in service of solving a problem, and will not apply new technology just because it is interesting to me.</p>
<p>I will consider the needs of users ahead of developer convenience and considerations of theoretical purity.</p>
<p>I do not build a method or an object, but a system, which may affect the client’s employees, users, and economic stability. My responsibility includes these related problems, if I am to care adequately for the system.</p>
<p>I will prevent problems whenever I can, for prevention is preferable to cure.</p>
<p>I will protect the environment which sustains us, in the knowledge that the continuing health of ourselves and our societies is dependent on a healthy planet.</p>
<p>I will remember that I remain a member of society, with special obligations to all my fellow human beings, not just the technologically privileged but also those who have less expensive devices and those who find computers difficult to use.</p>
<p>If I do not violate this oath, may I enjoy life and art, respected while I live and remembered with affection thereafter. May I always act so as to preserve the finest traditions of my calling and may I long experience the joy of solving interesting problems.</p>
<p><a href="https://capgemini.github.io/development/developers-hippocratic-oath/">A Hippocratic Oath for Software Engineers</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on August 13, 2021.</p>https://capgemini.github.io/architecture/enterprise-architecture-docops2021-06-17T00:00:00+01:002021-06-17T00:00:00+01:00Riccardo Freschihttps://capgemini.github.io/alumni#author-riccardo-freschi
<h2 id="introduction">Introduction</h2>
<blockquote>
<p>The purpose of Enterprise Architecture is to optimize across the enterprise the often fragmented legacy of processes (both manual and automated) into an integrated environment that is responsive to change and supportive of the delivery of the business strategy.
<cite><a href="https://pubs.opengroup.org/architecture/togaf9-doc/arch/index.html">The Open Group Architecture Framework, Part I: Introduction</a></cite></p>
</blockquote>
<p>One of the tasks of an Enterprise Architect (EA) is to produce a set of artifacts that collectively form the architecture documentation. Such documentation constitutes the main input to Architecture Governance (AG).</p>
<blockquote>
<p>Architecture Governance is an approach, a series of processes, a cultural orientation, and set of owned responsibilities that ensure the integrity and effectiveness of the organization’s architectures.
<cite><a href="https://pubs.opengroup.org/architecture/togaf9-doc/arch/chap44.html#tag_44_02_01_01">The Open Group Architecture Framework, Part VI: Architecture Capability Framework</a></cite></p>
</blockquote>
<p>In the absence of a general consensus on the format, structure and content of architecture documentation, a number of problems have made the subject somewhat vexed within the EA and software development communities, leading to well-known issues like artifacts that are:</p>
<ul>
<li>outdated, created some time in the past, the original authors unknown or moved on</li>
<li>crafted without clear purpose, by different people with insufficient coordination</li>
<li>overwhelming in volume, made of information scattered in various files or wiki pages without structure</li>
<li>painful to write and maintain because goals and tools to use are undefined or unclear</li>
</ul>
<p>Diagram maintenance is a particular source of discomfort: energy is spent during the inception phase creating shapes, connections and a fitting layout, only for that effort to be wasted as soon as updates are required and the artifact has to be reworked to a reasonable level of quality. Not to mention versioning: the delta between different versions of a diagram is not immediately visible and can be partially missed if not inspected with care.</p>
<p>On top of that, in the context of governance, documentation review and approval processes vary a lot; they are often not fully defined and lack formality, e.g. relying on files exchanged and reviews happening over email.</p>
<h2 id="doctoolchain">docToolchain</h2>
<p><a href="https://www.writethedocs.org/guide/docs-as-code/">Documentation as Code</a> (<em>docs-as-code</em>) refers to the philosophy of authoring documentation with the same tools as code.
Text-based version control systems, like <a href="https://git-scm.com/book/en/v2/Getting-Started-About-Version-Control">Git</a>, simplify and bring formality to review and approval processes via <a href="https://www.atlassian.com/git/tutorials/comparing-workflows">workflows</a> and <a href="https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests">pull requests</a>, enabling then easier and clearer paths to documentation finalisation.</p>
<p><a href="https://docs.asciidoctor.org/asciidoc/latest/">AsciiDoc</a> is a markup language that embraces the <em>docs-as-code</em> approach and is primarily conceived for writing technical documentation. Thanks to an AsciiDoc processor like <a href="https://docs.asciidoctor.org/asciidoctor/latest/#what-is-asciidoctor">Asciidoctor</a>, the language can be converted to a variety of output formats, such as HTML and PDF.</p>
<p>Similarly to <em>docs-as-code</em>, <em>diagrams-as-code</em> adopts text to represent and produce diagrams, with similar benefits.</p>
<p><a href="https://asciidoctor.org/docs/asciidoctor-diagram/">Asciidoctor Diagram</a> is a set of Asciidoctor extensions that empower the author to add text-described diagrams to an AsciiDoc document.
Each extension runs the diagram processor to generate an SVG, PNG, or TXT file from the input text. The generated file is then inserted into the converted document.
Nearly thirty diagram generators are supported, among them the popular <a href="https://plantuml.com/">PlantUML</a>, which facilitates designing a number of diagram types, e.g. sequence, class, state, timing, JSON, YAML, Gantt and many others. A few rendering examples are shown below (source code in the <a href="#appendix-plantuml-diagrams-source-code">Appendix</a>):</p>
<figure>
<img src="/images/2021-06-07-enterprise-architecture-docops/client-credentials-flow.png" alt="Client Credentials Flow" class="centered medium-8" />
<figcaption>Sequence type</figcaption>
</figure>
<figure>
<img src="/images/2021-06-07-enterprise-architecture-docops/class-diagram.png" alt="Animals" class="centered medium-8" />
<figcaption>Class type</figcaption>
</figure>
<figure>
<img src="/images/2021-06-07-enterprise-architecture-docops/jk-rowling-json.png" alt="J. K. Rowling JSON" class="centered medium-8" />
<figcaption>JSON type</figcaption>
</figure>
<p><a href="https://arc42.org/">arc42</a> is a template for documenting software and system architectures, whose golden master is formatted in <a href="https://docs.asciidoctor.org/asciidoc/latest/">AsciiDoc</a> and which is <a href="https://arc42.org/download">publicly available</a>.
It is segmented into twelve sections, each containing help, divided into contents, motivation and form. <a href="https://arc42.org/examples">Real-world examples</a> are also available.</p>
<p>Putting it all together is <a href="https://doctoolchain.github.io/docToolchain/">docToolchain</a>, a set of scripts that automate the steps of exporting AsciiDoc documents (including arc42-based ones) and rendering diagrams, all to the chosen target format (e.g. HTML).</p>
<h2 id="docops-with-azure-devops">DocOps with Azure DevOps</h2>
<p><a href="https://www.writethedocs.org/guide/doc-ops/">DocOps</a> applies to the creation, management, and release of documentation, similarly to those applied to source code by <a href="https://www.atlassian.com/devops">DevOps</a>: it is a set of practices automating and integrating the process of developing documentation across engineering, product, support, and technical writing teams.</p>
<p>Starting with docToolchain, in order to fully comply with the DocOps approach, the missing piece is the automatic deployment of the documentation, once the approval is granted (in the form of an approved pull request).</p>
<p>To cover that “last mile”, I decided to host the documentation on Azure DevOps Repos and implement a pipeline in Azure DevOps Pipelines that, after being triggered by an update to the documentation repository master branch, builds an HTML page with text and diagrams and deploys to a website.
For the purpose of this exercise I decided to use a <a href="https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-static-website">static website hosted in Azure Storage</a> as the destination (a lot of other types are available, selection depending on requirements).</p>
<p><img src="/images/2021-06-07-enterprise-architecture-docops/docops.png" alt="DocOps pipeline on Azure DevOps" class="centered medium-12" /></p>
<p>(The PlantUML diagram definition for the above is in <a href="#appendix-plantuml-diagrams-source-code">Appendix</a>.)</p>
<p>Here is the definition of the pipeline in YAML:</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">variables</span><span class="pi">:</span>
<span class="na">documentationRoot</span><span class="pi">:</span> <span class="c1">#subpath to the folder where arc42-template.adoc is located,</span>
<span class="c1">#e.g. after arc42 template download and unzip: arc42-template-EN-withhelp-asciidoc</span>
<span class="na">resources</span><span class="pi">:</span>
<span class="na">repositories</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">repository</span><span class="pi">:</span> <span class="s">docToolchain</span>
<span class="na">type</span><span class="pi">:</span> <span class="s">github</span>
<span class="na">endpoint</span><span class="pi">:</span> <span class="c1">#name of the Azure DevOps connection to github</span>
<span class="na">name</span><span class="pi">:</span> <span class="s">docToolchain/docToolchain</span>
<span class="na">trigger</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">master</span>
<span class="na">pool</span><span class="pi">:</span>
<span class="na">vmImage</span><span class="pi">:</span> <span class="s">ubuntu-latest</span>
<span class="na">steps</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">checkout</span><span class="pi">:</span> <span class="s">self</span>
<span class="pi">-</span> <span class="na">checkout</span><span class="pi">:</span> <span class="s">docToolchain</span>
<span class="na">submodules</span><span class="pi">:</span> <span class="s">recursive</span>
<span class="pi">-</span> <span class="na">script</span><span class="pi">:</span> <span class="s">sudo apt install graphviz</span> <span class="c1">#dependency of some PlantUML diagram types</span>
<span class="na">displayName</span><span class="pi">:</span> <span class="s1">'</span><span class="s">Install</span><span class="nv"> </span><span class="s">graphviz'</span>
<span class="pi">-</span> <span class="na">script</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">cd docToolchain</span>
<span class="s">rm -rf .git</span>
<span class="s">rm -rf resources/asciidoctor-reveal.js/.git</span>
<span class="s">rm -rf resources/reveal.js/.git</span>
<span class="s">./gradlew -b init.gradle initExisting -PnewDocDir="$(Build.SourcesDirectory)/$(documentationRoot)"</span>
<span class="s">./bin/doctoolchain "$(Build.SourcesDirectory)/$(documentationRoot)" generateHTML</span>
<span class="na">displayName</span><span class="pi">:</span> <span class="s">Generate HTML</span>
<span class="pi">-</span> <span class="na">task</span><span class="pi">:</span> <span class="s">AzureCLI@1</span>
<span class="na">displayName</span><span class="pi">:</span> <span class="s">Azure File Copy to Storage</span>
<span class="na">inputs</span><span class="pi">:</span>
<span class="na">azureSubscription</span><span class="pi">:</span> <span class="c1">#name of the Azure DevOps connection to the Azure Subscription </span>
<span class="c1">#where the storage container is hosted</span>
<span class="na">scriptLocation</span><span class="pi">:</span> <span class="s">inlineScript</span>
<span class="na">inlineScript</span><span class="pi">:</span> <span class="pi">|</span>
<span class="s">az storage blob upload-batch \</span>
<span class="s">--destination \$web \</span>
<span class="s">--account-name "myStorageAccount" \</span>
<span class="s">--source "$(Build.SourcesDirectory)/$(documentationRoot)/build/html5"</span></code></pre></figure>
<h2 id="conclusion">Conclusion</h2>
<p>I like working with AsciiDoc for personal and shared documentation: its advantages over <a href="https://daringfireball.net/projects/markdown/">Markdown</a>, the more widely adopted alternative, are extensibility and having a single flavour, as opposed to <a href="https://github.com/commonmark/commonmark-spec/wiki/Markdown-Flavors">the many</a> that exist for Markdown, which make that language difficult to port between environments. I also appreciate the large range of diagramming tools supported.</p>
<p>I believe <em>diagrams-as-code</em> is a handy approach for versioning images together with the document and enabling exact and fast delta highlighting. The default styling, especially in PlantUML, is quite basic, but can be customised if needed.
Adopting the default style, though, has the advantage of consistency between documents and authors, which helps the reader grasp the concepts depicted in the diagrams more quickly.</p>
<p>I found arc42 a good starting point as a template, though I feel it is missing the next level of detail, e.g.: sections related to the Logical Information Model (objects, stores and flows) and a Security view.</p>
<p>Finally, I particularly appreciate the workflow achieved with text-only documentation, source control and CI/CD: the easy collaboration, the approval process, the clear versioning, the history tracking that comes with it, the evidence of which version is current, and the auto-deployment.</p>
<h2 id="appendix-plantuml-diagrams-source-code">Appendix: PlantUML diagrams source code</h2>
<h3 id="sequence-type">Sequence type</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@startuml
actor User
User -> "Authentication Provider" as Auth : Authenticate with Client ID + Client Secret
activate Auth
Auth -> Auth : Validate Client ID + Client Secret
Auth --> User : Access Token
User -> "Web Service" as Service : Request Data with Access Token
Service --> User : Response
@enduml
</code></pre></div></div>
<h3 id="class-type">Class type</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@startuml
class Animal {
age
gender
isMammal()
+feed()
}
class Duck {
featherColour
beak
+swim()
+quack()
}
class Lion {
maneColour
+roar()
+chase()
}
class Beak {
colour
length
+open()
+close()
}
Animal <|- Duck
Animal <|-- Lion
Duck *- Beak
@enduml
</code></pre></div></div>
<h3 id="json-type">JSON type</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@startjson
{
"name":"J. K. Rowling",
"born":"31 July 1965",
"genre":"Fantasy",
"country":"United Kingdom",
"occupation":"author",
"books":[
{
"title":"Harry Potter and the Philosopher's Stone",
"yearPublished":1997,
"pages":223
},
{
"title":"Harry Potter and the Chamber of Secrets",
"yearPublished":1998,
"pages":251
}
]
}
@endjson
</code></pre></div></div>
<h3 id="docops-workflow">DocOps workflow</h3>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@startuml
!define AzurePuml https://raw.githubusercontent.com/plantuml-stdlib/Azure-PlantUML/master/dist
!includeurl AzurePuml/AzureCommon.puml
!includeurl AzurePuml/Storage/AzureBlobStorage.puml
!includeurl AzurePuml/Identity/AzureActiveDirectory.puml
!includeurl AzurePuml/DevOps/AzurePipelines.puml
!includeurl AzurePuml/DevOps/AzureRepos.puml
!includeurl AzurePuml/General/Azure.puml
skinparam linetype polyline
skinparam linetype ortho
file "arc42 template" as arc42
Azure(azure, "Cloud computing", "Subscription", ) {
AzureBlobStorage(blob, "Storage Account", "$web Container",) {
file HTML
}
AzurePipelines(pipeline, "Build + release", "CI/CD",) {
[GraphViz]
component docToolchain {
[Asciidoctor]
component "Asciidoctor Diagram" {
[PlantUML processor] as PlantUML
[Other processors, e.g. Mermaid] as otherProcessors
}
[Other generators, e.g. PDF]
[HTML generator] as HTMLgenerator
}
}
AzureRepos(repo, "Architecture documentation", "Repository", ) {
file "Diagrams, format: e.g. PlantUML"
file "Text documents, format: AsciiDoc"
}
AzureActiveDirectory(ad, "Authentication + authorisation", "AD",)
}
repo ---> docToolchain
arc42 <|--- repo
HTMLgenerator --> HTML
GraphViz -- PlantUML: < dependency
otherProcessors -[hidden]-> blob
@enduml
</code></pre></div></div>
<p><a href="https://capgemini.github.io/architecture/enterprise-architecture-docops/">Enterprise Architecture DocOps</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on June 17, 2021.</p>https://capgemini.github.io/engineering/improving-system-performance-with-redis2021-05-25T00:00:00+01:002021-05-25T00:00:00+01:00Riccardo Freschihttps://capgemini.github.io/alumni#author-riccardo-freschi
<h2 id="introduction">Introduction</h2>
<p><a href="https://redis.io/">Redis</a> (REmote DIctionary Server) is an open source in-memory <em>data structures server</em>. Similarly to <em>key-value</em> stores, like <a href="https://memcached.org/">Memcached</a>, data on Redis is held in key-value pairs. Differently though, while in key-value stores both the key and the value are strings, in Redis the key can be any binary sequence, like a string, but also a digest (e.g. the output of a SHA-2 function) and the value can be of <a href="https://redis.io/topics/data-types-intro">different types</a>, among them:</p>
<ul>
<li>Lists, e.g.:</li>
</ul>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">key</span> <span class="o">=</span> <span class="s">"shapes"</span>
<span class="n">value</span> <span class="o">=</span> <span class="p">[</span><span class="s">"square"</span><span class="p">,</span> <span class="s">"triangle"</span><span class="p">,</span> <span class="s">"triangle"</span><span class="p">]</span> <span class="c1"># duplicates allowed</span></code></pre></figure>
<ul>
<li>Sets, e.g.:</li>
</ul>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">key</span> <span class="o">=</span> <span class="s">"shapes"</span>
<span class="n">value</span> <span class="o">=</span> <span class="p">[</span><span class="s">"square"</span><span class="p">,</span> <span class="s">"triangle"</span><span class="p">]</span></code></pre></figure>
<ul>
<li>Hashes, e.g.:</li>
</ul>
<figure class="highlight"><pre><code class="language-python" data-lang="python"><span class="n">key</span> <span class="o">=</span> <span class="s">"shapes"</span>
<span class="n">value</span> <span class="o">=</span>
<span class="p">{</span>
<span class="s">"name"</span><span class="p">:</span> <span class="s">"square"</span><span class="p">,</span>
<span class="s">"sides"</span><span class="p">:</span> <span class="mi">4</span>
<span class="p">}</span></code></pre></figure>
<ul>
<li>Bit arrays/bitmaps, e.g.:</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10000001010010 # up to 2^32 different bits
</code></pre></div></div>
<p>Because values are typed, Redis is capable of manipulating the content accordingly, e.g.: prepending or appending new elements to a list, computing the intersection between two sets, replacing single elements in a hash, etc.</p>
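<p>As a quick illustration of what those typed operations look like in practice, here is a minimal sketch using the redis-py client; the client library, connection details and key names are assumptions made purely for the example, not part of the system described later in this post:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import redis

# Connect to a local Redis instance (host/port are placeholders)
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Lists: append and prepend elements
r.rpush("shapes", "square", "triangle", "triangle")  # duplicates allowed
r.lpush("shapes", "circle")                          # prepend

# Sets: compute the intersection of two sets
r.sadd("shapes:flat", "square", "triangle", "circle")
r.sadd("shapes:pointy", "triangle", "star")
print(r.sinter("shapes:flat", "shapes:pointy"))      # {'triangle'}

# Hashes: replace a single field without rewriting the whole value
r.hset("shape:1", mapping={"name": "square", "sides": 4})
r.hset("shape:1", "sides", 3)                        # update one field only
print(r.hgetall("shape:1"))
</code></pre></div></div>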
<p><a href="https://docs.microsoft.com/en-us/azure/azure-cache-for-redis/cache-overview">Azure Cache for Redis</a> is a managed in-memory data store service, based on Redis, offered by Microsoft Azure.</p>
<p>I have recently had the opportunity to work on an enterprise software system, featuring a traditional <a href="https://en.wikipedia.org/wiki/Multitier_architecture">3-tier architecture</a>, fully hosted in Azure, made of the following components:</p>
<ul>
<li>web and mobile application clients</li>
<li>mid-tier backed by a Node.js application, leveraging the Express framework and <a href="http://azure.github.io/azure-mobile-apps-node">Azure Mobile Apps SDK</a></li>
<li>data layer, backed by a General Purpose, Gen5, 2 vCores Azure SQL database</li>
</ul>
<p>Despite the optimisations performed on tables and queries, the latency of client calls hitting a few specific tables was still considered too high. Hence, adopting the <a href="https://docs.microsoft.com/en-us/azure/architecture/patterns/cache-aside">cache-aside pattern</a> leveraging Azure Cache for Redis was explored as a possible solution and the original vs. new behaviour profiled.</p>
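<p>Before diving into the Azure-specific setup and the Node.js changes, the following is a minimal sketch of the cache-aside pattern itself, written here in Python with the redis-py client; <em>query_database</em> is a hypothetical stand-in for the real Azure SQL query, not part of the actual system:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
import redis

CACHE_TTL_SECONDS = 6 * 60 * 60  # matches the 6-hour expiry discussed later

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def query_database(url):
    # Hypothetical placeholder for the real (slower) Azure SQL query
    return [{"id": 1, "url": url}]

def get_with_cache_aside(url):
    cached = r.get(url)                  # 1. look the key up in the cache
    if cached is not None:
        return json.loads(cached)        # 2. hit: return the cached value
    result = query_database(url)         # 3. miss: query the database
    r.setex(url, CACHE_TTL_SECONDS, json.dumps(result))  # 4. populate the cache
    return result
</code></pre></div></div>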
<h2 id="creating-the-cache-in-azure">Creating the cache in Azure</h2>
<p>Setting up Redis in Azure is pretty straightforward, as detailed in the <a href="https://docs.microsoft.com/en-us/azure/azure-cache-for-redis/cache-nodejs-get-started">quickstart documentation</a>.
After completion, host name, ports, and access keys are available to be used by external applications for connection.
The tier selected for the development and benchmarking configuration was <a href="https://azure.microsoft.com/en-gb/pricing/details/cache/">Basic C0</a>.</p>
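<p>Once the cache is created, the host name, port and access key from the portal are all that is needed to connect. As a rough sketch, in Python with redis-py a connection could look like the following; the environment variable names mirror the ones used in the Node.js code below and are an assumption for the example, not something Azure mandates:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os
import redis

# Azure Cache for Redis exposes TLS on port 6380; the non-TLS port can be disabled
cache = redis.Redis(
    host=os.environ["REDISCACHEHOSTNAME"],        # e.g. mycache.redis.cache.windows.net
    port=int(os.environ.get("REDISPORT", 6380)),
    password=os.environ["REDISCACHEKEY"],
    ssl=True,
)

print(cache.ping())  # True if the host, port and access key are valid
</code></pre></div></div>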
<h2 id="changes-to-the-mid-tier-source-code">Changes to the mid-tier source code</h2>
<p>The basic Azure Mobile Apps table controller is as per the <a href="https://github.com/Azure/azure-mobile-apps-node/blob/master/samples/todo/tables/TodoItem.js">samples on GitHub</a>.
To support Redis, the relevant <a href="https://www.npmjs.com/package/redis">npm package</a> was added and the following changes were applied to the table controllers:</p>
<h3 id="cache-servicejs"><em>cache-service.js</em></h3>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">var</span> <span class="nx">redis</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="dl">"</span><span class="s2">redis</span><span class="dl">"</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">cacheConnection</span> <span class="o">=</span> <span class="nx">module</span><span class="p">.</span><span class="nx">exports</span> <span class="o">=</span> <span class="nx">redis</span><span class="p">.</span><span class="nx">createClient</span><span class="p">(</span><span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">REDISPORT</span><span class="p">,</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">REDISCACHEHOSTNAME</span><span class="p">,</span>
<span class="p">{</span><span class="na">auth_pass</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">REDISCACHEKEY</span><span class="p">,</span> <span class="na">tls</span><span class="p">:</span> <span class="p">{</span><span class="na">servername</span><span class="p">:</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">REDISCACHEHOSTNAME</span><span class="p">}});</span></code></pre></figure>
<h3 id="tablejs"><em>table.js</em></h3>
<figure class="highlight"><pre><code class="language-javascript" data-lang="javascript"><span class="kd">const</span> <span class="nx">cacheConnection</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">../cache-service.js</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">var</span> <span class="nx">table</span> <span class="o">=</span> <span class="nx">module</span><span class="p">.</span><span class="nx">exports</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">azure-mobile-apps</span><span class="dl">'</span><span class="p">).</span><span class="nx">table</span><span class="p">();</span>
<span class="nx">table</span><span class="p">.</span><span class="nx">read</span><span class="p">(</span><span class="kd">function</span> <span class="p">(</span><span class="nx">context</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="k">new</span> <span class="nb">Promise</span><span class="p">(</span> <span class="p">(</span><span class="nx">resolve</span><span class="p">,</span><span class="nx">reject</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">let</span> <span class="nx">url</span> <span class="o">=</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="nx">context</span><span class="p">.</span><span class="nx">req</span><span class="p">.</span><span class="nx">originalUrl</span><span class="p">);</span>
<span class="nx">cacheConnection</span><span class="p">.</span><span class="kd">get</span><span class="p">(</span><span class="nx">url</span><span class="p">,</span> <span class="p">(</span><span class="nx">err</span><span class="p">,</span> <span class="nx">cachedResults</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">reject</span><span class="p">(</span><span class="nx">err</span><span class="p">);</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="nx">cachedResults</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">resolve</span><span class="p">(</span><span class="nx">JSON</span><span class="p">.</span><span class="nx">parse</span><span class="p">(</span><span class="nx">cachedResults</span><span class="p">));</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="nx">context</span><span class="p">.</span><span class="nx">execute</span><span class="p">().</span><span class="nx">then</span><span class="p">(</span><span class="nx">sqlResults</span> <span class="o">=></span> <span class="p">{</span>
<span class="nx">cacheConnection</span><span class="p">.</span><span class="nx">setex</span><span class="p">(</span><span class="nx">url</span><span class="p">,</span> <span class="nx">process</span><span class="p">.</span><span class="nx">env</span><span class="p">.</span><span class="nx">REDISCACHEEXPIRY</span><span class="p">,</span> <span class="nx">JSON</span><span class="p">.</span><span class="nx">stringify</span><span class="p">(</span><span class="nx">sqlResults</span><span class="p">));</span>
<span class="nx">resolve</span><span class="p">(</span><span class="nx">sqlResults</span><span class="p">);</span>
<span class="p">}).</span><span class="k">catch</span><span class="p">(</span><span class="nx">error</span> <span class="o">=></span> <span class="p">{</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">error</span><span class="p">(</span><span class="nx">error</span><span class="p">);</span>
<span class="nx">reject</span><span class="p">(</span><span class="nx">err</span><span class="p">);</span>
<span class="p">})</span>
<span class="p">}</span>
<span class="p">})</span>
<span class="p">})</span>
<span class="p">})</span></code></pre></figure>
<p>Upon request reception, the related path is serialised into a string, which is used as a key in Redis: the key is looked up in the cache and, in case of a <em>hit</em>, the cached value is returned to the client. Otherwise, in case of a <em>miss</em>, the database is queried, the result is returned to the client and the key-value pair is inserted into the cache.</p>
<h2 id="benchmarking-setup">Benchmarking setup</h2>
<p>The data held in the tables of interest exhibits a synchronous, low-frequency update pattern, which allows a long time to live (6 hours) to be set for the related cache keys.</p>
<p>To benchmark the solution including Redis vs. the original one, the keys/requests were extracted from the cache, close to the end of the expiration window, to maximise the representativeness of the sample. The requests were then utilised as input to the profiling <a href="https://httpd.apache.org/docs/2.4/programs/ab.html">Apache Benchmark</a> (AB) tool, which issues requests against the given endpoints and reports on timings.</p>
<h3 id="redis-cli">redis-cli</h3>
<p>The tool used to extract the keys is the <a href="https://redis.io/topics/rediscli">redis-cli</a>. Given the limited support from the Azure Console, to run commands against the Redis service, a <a href="https://hub.docker.com/_/redis">Redis Docker image</a> was started locally and the relevant redis-cli commands were executed to connect to the remote cache and download the keys:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="nv">$ </span>docker pull redis
<span class="nv">$ </span>docker run <span class="nt">-d</span> <span class="nt">-p</span> 6379:6379 <span class="nt">--name</span> redis1 redis
<span class="nv">$ </span>docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fabee04737a9 redis <span class="s2">"docker-entrypoint.s…"</span> About a minute ago Up About a minute 0.0.0.0:6379->6379/tcp redis1
<span class="nv">$ </span><span class="nb">echo</span> <span class="s2">"keys *"</span> | redis-cli <span class="nt">-h</span> <span class="o">{</span>server name<span class="o">}</span>.redis.cache.windows.net <span class="nt">-p</span> 6379 <span class="nt">-a</span> <span class="o">{</span>access key<span class="o">}</span> <span class="o">></span> <span class="o">{</span>file name<span class="o">}</span></code></pre></figure>
<h3 id="apache-benchmark">Apache Benchmark</h3>
<p>AB is the tool chosen for profiling the responsiveness of the server. It simulates client behaviour by exercising a server’s HTTP endpoints. Among its most notable options are:</p>
<ul>
<li><em>n</em> : the number of requests</li>
<li><em>c</em> : the degree of parallelism (the number of multiple requests to perform at a time, the default is one)</li>
</ul>
<p>The forecasted maximum number of concurrent users in the system under examination is 20.</p>
<p>A sample report looks like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
Concurrency Level: 1
Time taken for tests: 14.522 seconds
Complete requests: 10
Failed requests: 0
Total transferred: 1241440 bytes
HTML transferred: 1231310 bytes
Requests per second: 0.69 [#/sec] (mean)
Time per request: 1452.156 [ms] (mean)
Time per request: 1452.156 [ms] (mean, across all concurrent requests)
Transfer rate: 83.49 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 140 164 29.6 161 244
Processing: 1109 1288 169.8 1280 1685
Waiting: 968 1154 167.7 1167 1538
Total: 1271 1452 175.7 1442 1851
Percentage of the requests served within a certain time (ms)
50% 1442
66% 1455
75% 1558
80% 1588
90% 1851
95% 1851
98% 1851
99% 1851
100% 1851 (longest request)
</code></pre></div></div>
<h3 id="the-script">The script</h3>
<p>Leveraging AB, for both the with/without Redis scenarios, a bash script was implemented to:</p>
<ul>
<li>retrieve the keys/urls</li>
<li>loop through those</li>
<li>target each one, running the AB command</li>
<li>store the output in a file</li>
<li>parse the output to extract the mean and standard deviation of the latency</li>
</ul>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="c">#!/bin/bash</span>
<span class="nv">redis_keys_file</span><span class="o">=</span> <span class="c"># the file listing the redis keys (urls/endpoints)</span>
<span class="nv">total_ab_requests</span><span class="o">=</span> <span class="c"># total number of requests per url/endpoint</span>
<span class="nv">parallel_ab_requests</span><span class="o">=</span> <span class="c"># number of requests to run in parallel</span>
<span class="nv">redis_server</span><span class="o">=</span> <span class="c"># redis server</span>
<span class="nv">redis_port</span><span class="o">=</span> <span class="c"># redis port</span>
<span class="nv">access_key</span><span class="o">=</span> <span class="c"># access key</span>
<span class="nv">node_server</span><span class="o">=</span> <span class="c"># node server</span>
<span class="nv">with_redis_table_name</span><span class="o">=</span> <span class="c"># table name in case of solution with Redis</span>
<span class="nv">without_redis_table_name</span><span class="o">=</span> <span class="c"># table name in case of solution without Redis</span>
<span class="nv">ab_output_with_redis</span><span class="o">=</span><span class="s1">'./ab_output_with_redis.txt'</span>
<span class="nv">chart_data_with_redis</span><span class="o">=</span><span class="s1">'./chart_data_with_redis.txt'</span>
<span class="nv">ab_output_without_redis</span><span class="o">=</span><span class="s1">'./ab_output_without_redis.txt'</span>
<span class="nv">chart_data_without_redis</span><span class="o">=</span><span class="s1">'./chart_data_without_redis.txt'</span>
<span class="o">[</span> <span class="nt">-e</span> <span class="nv">$ab_output_with_redis</span> <span class="o">]</span> <span class="o">&&</span> <span class="nb">rm</span> <span class="nv">$ab_output_with_redis</span>
<span class="o">[</span> <span class="nt">-e</span> <span class="nv">$ab_output_without_redis</span> <span class="o">]</span> <span class="o">&&</span> <span class="nb">rm</span> <span class="nv">$ab_output_without_redis</span>
<span class="nb">echo</span> <span class="s2">"keys *"</span> | redis-cli <span class="nt">-h</span> <span class="k">${</span><span class="nv">redis_server</span><span class="k">}</span>.redis.cache.windows.net <span class="nt">-p</span> <span class="nv">$redis_port</span> <span class="nt">-a</span> <span class="nv">$access_key</span> <span class="o">></span> <span class="nv">$redis_keys_file</span>
<span class="nv">total_lines</span><span class="o">=</span><span class="si">$(</span><span class="nb">wc</span> <span class="nt">-l</span> < <span class="nv">$redis_keys_file</span><span class="si">)</span>
<span class="k">while </span><span class="nb">read </span>line<span class="p">;</span> <span class="k">do
</span><span class="nv">host</span><span class="o">=</span><span class="s2">"https://</span><span class="k">${</span><span class="nv">node_server</span><span class="k">}</span><span class="s2">.azurewebsites.net"</span>
<span class="nv">path_no_quotes</span><span class="o">=</span><span class="k">${</span><span class="nv">line</span><span class="p">//\</span><span class="s2">"/} # remove quotes
path_decoded_url=</span><span class="si">$(</span><span class="nb">echo</span> <span class="nt">-e</span> <span class="k">${</span><span class="nv">path_no_quotes</span><span class="p">//%/\\x</span><span class="k">}</span><span class="si">)</span><span class="s2"> # decode url
path=</span><span class="k">${</span><span class="nv">path_decoded_url</span><span class="p">// /+</span><span class="k">}</span><span class="s2"> # replace spaces with '+'
path_without_redis=</span><span class="k">${</span><span class="nv">path</span><span class="p">//</span><span class="nv">$with_redis_table_name</span><span class="p">/</span><span class="nv">$without_redis_table_name</span><span class="k">}</span><span class="s2"> # replace Redis endpoint with non-Redis endpoint, the backing database table remains the same in both cases
ab -n </span><span class="nv">$total_ab_requests</span><span class="s2"> -c </span><span class="nv">$parallel_ab_requests</span><span class="s2"> -H 'zumo-api-version: 2.0.0' </span><span class="nv">$host$path</span><span class="s2"> >> </span><span class="nv">$ab_output_with_redis</span><span class="s2">
ab -n </span><span class="nv">$total_ab_requests</span><span class="s2"> -c </span><span class="nv">$parallel_ab_requests</span><span class="s2"> -H 'zumo-api-version: 2.0.0' </span><span class="nv">$host$path_without_redis</span><span class="s2"> >> </span><span class="nv">$ab_output_without_redis</span><span class="s2">
done < </span><span class="nv">$redis_keys_file</span><span class="s2">
# extract mean and standard deviation figures for each iteration
output_files=( </span><span class="nv">$ab_output_with_redis</span><span class="s2"> </span><span class="nv">$chart_data_with_redis</span><span class="s2"> </span><span class="nv">$ab_output_without_redis</span><span class="s2"> </span><span class="nv">$chart_data_without_redis</span><span class="s2">)
for index in 0 2;
do
output_figures="</span><span class="si">$(</span><span class="nb">grep</span> <span class="s1">'Total:'</span> <span class="k">${</span><span class="nv">output_files</span><span class="p">[</span><span class="nv">$index</span><span class="p">]</span><span class="k">}</span><span class="si">)</span><span class="s2">" # the line including mean and standard deviation
[ -e </span><span class="k">${</span><span class="nv">output_files</span><span class="p">[</span><span class="nv">$index</span><span class="p">+1]</span><span class="k">}</span><span class="s2"> ] && rm </span><span class="k">${</span><span class="nv">output_files</span><span class="p">[</span><span class="nv">$index</span><span class="p">+1]</span><span class="k">}</span><span class="s2">
echo "</span><span class="nv">mean</span><span class="p"> sd url</span><span class="s2">" > </span><span class="k">${</span><span class="nv">output_files</span><span class="p">[</span><span class="nv">$index</span><span class="p">+1]</span><span class="k">}</span><span class="s2"> # heading
i=0
while IFS= read -r figure; do
figures_array=(</span><span class="nv">$figure</span><span class="s2">) # split the figures
printf "</span><span class="p">%s %s %s\n</span><span class="s2">" </span><span class="k">${</span><span class="nv">figures_array</span><span class="p">[2]</span><span class="k">}</span><span class="s2"> </span><span class="k">${</span><span class="nv">figures_array</span><span class="p">[3]</span><span class="k">}</span><span class="s2"> </span><span class="nv">$i</span><span class="s2"> >> </span><span class="k">${</span><span class="nv">output_files</span><span class="p">[</span><span class="nv">$index</span><span class="p">+1]</span><span class="k">}</span><span class="s2"> # append the figures + url index
((i++))
done <<< "</span><span class="nv">$output_figures</span><span class="s2">"
done</span></code></pre></figure>
<h2 id="test-results">Test Results</h2>
<p>The output of the previous step is a set of 2 files, representing the latency for the two solutions, with and without cache.
The format of the files is as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mean sd url
1452 175.7 0
1429 96.6 1
1541 287.6 2
1224 57.6 3
1241 153.0 4
...
</code></pre></div></div>
<h3 id="charting-the-results">Charting the results</h3>
<p>Last, a small script in R plots, in the same graph, the data sourced from the two files:</p>
<figure class="highlight"><pre><code class="language-r" data-lang="r"><span class="n">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">RColorBrewer</span><span class="p">)</span><span class="w">
</span><span class="c1"># Load the data</span><span class="w">
</span><span class="n">withRedis</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">read.table</span><span class="p">(</span><span class="s2">"./chart_data_with_redis.txt"</span><span class="w"> </span><span class="p">,</span><span class="w"> </span><span class="n">header</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span><span class="n">withoutRedis</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">read.table</span><span class="p">(</span><span class="s2">"./chart_data_without_redis.txt"</span><span class="w"> </span><span class="p">,</span><span class="w"> </span><span class="n">header</span><span class="o">=</span><span class="kc">TRUE</span><span class="p">)</span><span class="w">
</span><span class="n">withRedis</span><span class="w"> </span><span class="o">$</span><span class="n">type</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"with Redis"</span><span class="w">
</span><span class="n">withoutRedis</span><span class="w"> </span><span class="o">$</span><span class="n">type</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="s2">"without Redis"</span><span class="w">
</span><span class="n">A</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rbind</span><span class="p">(</span><span class="n">withRedis</span><span class="p">,</span><span class="w"> </span><span class="n">withoutRedis</span><span class="p">)</span><span class="w">
</span><span class="c1"># Plot the data</span><span class="w">
</span><span class="n">ggplot</span><span class="p">(</span><span class="n">data</span><span class="o">=</span><span class="n">A</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">x</span><span class="o">=</span><span class="n">url</span><span class="p">,</span><span class="w"> </span><span class="n">y</span><span class="o">=</span><span class="n">mean</span><span class="p">,</span><span class="w"> </span><span class="n">ymin</span><span class="o">=</span><span class="n">pmax</span><span class="p">(</span><span class="n">mean</span><span class="o">-</span><span class="n">sd</span><span class="p">,</span><span class="w"> </span><span class="m">0</span><span class="p">),</span><span class="w"> </span><span class="n">ymax</span><span class="o">=</span><span class="n">mean</span><span class="o">+</span><span class="n">sd</span><span class="p">,</span><span class="w"> </span><span class="n">fill</span><span class="o">=</span><span class="n">type</span><span class="p">,</span><span class="w"> </span><span class="n">linetype</span><span class="o">=</span><span class="n">type</span><span class="p">))</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_line</span><span class="p">()</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">geom_ribbon</span><span class="p">(</span><span class="n">alpha</span><span class="o">=</span><span class="m">0.5</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">xlab</span><span class="p">(</span><span class="s2">"Url ID"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">ylab</span><span class="p">(</span><span class="s2">"Time (ms)"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">theme</span><span class="p">(</span><span class="n">legend.title</span><span class="o">=</span><span class="n">element_blank</span><span class="p">())</span><span class="w"> </span><span class="o">+</span><span class="w">
</span><span class="n">scale_fill_brewer</span><span class="p">(</span><span class="n">palette</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Set1"</span><span class="p">)</span></code></pre></figure>
<p>Here are the charts, produced by running the bash script with different values of <em>total_ab_requests</em> and <em>parallel_ab_requests</em> (assigned to AB’s <em>n</em> and <em>c</em> parameters respectively):</p>
<figure>
<img src="/images/2021-05-12-improving-system-performance-with-redis/Rplot_n10_c1.jpg" alt="Total 10 requests, 1 in parallel" />
<figcaption>total_ab_requests = 10, parallel_ab_requests = 1</figcaption>
</figure>
<figure>
<img src="/images/2021-05-12-improving-system-performance-with-redis/Rplot_n10_c10.jpg" alt="Total 10 requests, 10 in parallel" />
<figcaption>total_ab_requests = 10, parallel_ab_requests = 10</figcaption>
</figure>
<figure>
<img src="/images/2021-05-12-improving-system-performance-with-redis/Rplot_n20_c4.jpg" alt="Total 20 requests, 4 in parallel" />
<figcaption>total_ab_requests = 20, parallel_ab_requests = 4</figcaption>
</figure>
<figure>
<img src="/images/2021-05-12-improving-system-performance-with-redis/Rplot_n100_c20.jpg" alt="Total 100 requests, 20 in parallel" />
<figcaption>total_ab_requests = 100, parallel_ab_requests = 20</figcaption>
</figure>
<h2 id="conclusion">Conclusion</h2>
<p>Introducing a cache in the mid-tier layer has reduced the backend latency by a factor ranging from a few times to hundreds of times, depending on the degree of parallelism of the requests.
It is worth noting that, as the number of parallel requests increases, the standard deviation increases too, reducing the confidence of fulfilling client requests within a deterministic time frame.
In the updates to the Node.js service described previously, Redis was used as a pure key-value store. Further enhancements will be explored to exploit its capability to accept and manipulate data structures hosted in the values.</p>
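<p>As a hint of what exploiting those data structures could look like, the sketch below (again Python with redis-py, with an invented key and fields purely for illustration) caches a record as a Redis hash rather than a serialised JSON string, so that individual fields can be read or updated without deserialising the whole payload:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Store one cached record as a hash rather than a JSON blob
r.hset("item:42", mapping={"title": "example", "status": "open", "version": 1})
r.expire("item:42", 6 * 60 * 60)       # keep the same 6-hour time to live

# Update or read a single field without touching the rest of the record
r.hset("item:42", "status", "closed")
print(r.hget("item:42", "status"))
</code></pre></div></div>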
<p><a href="https://capgemini.github.io/engineering/improving-system-performance-with-redis/">Improving system performance with Redis</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 25, 2021.</p>https://capgemini.github.io/aws/My-Experience-With-The-New-AWS-SysOps-Certification-Exam-SOA-C022021-05-21T00:00:00+01:002021-05-21T00:00:00+01:00Matt Antleyhttps://capgemini.github.io/authors#author-matt-antley
<p>Earlier this year, while it was available, I sat an online proctored exam for the new Beta version of the AWS Certified SysOps Administrator - Associate certification exam (SOA-C02); for more information you can check out the <a href="https://d1.awsstatic.com/training-and-certification/docs-sysops-associate/AWS-Certified-SysOps-Administrator-Associate_Exam-Guide_C02.pdf">Exam Guide</a>. These are some thoughts on why I decided to sit the Beta exam, how I went about revising (including the topics I covered) and some information about the exam itself.</p>
<p><img src="/images/2021-05-18-AWS-SysOps-Certification/cap-aws-sysops.png" alt="AWS SysOps Certification Logo" class="centered" /></p>
<h2 id="why-the-beta-exam">Why the Beta exam</h2>
<p>I have been asked why I decided to sit the Beta version of the AWS SysOps exam and the main reasons for it are:</p>
<ul>
<li>It included questions about newer AWS services - this meant I had to revise and learn about these services, which in turn brought me up to date with what AWS has released recently and helped me better understand their offerings</li>
<li>Different exam questions - previously the exam only included multiple-choice and multi-answer style questions; the SOA-C02 exam also includes a set of hands-on labs, which I wanted to try out</li>
</ul>
<p>This was an opportunity to learn about the newer AWS services that made up the exam as well as take a look into the new format of the exam with the inclusion of the hands-on labs. The only drawback to all this is waiting for up to 90 days after the Beta exam period has ended for your results. As of writing this I am still waiting for my results!</p>
<h2 id="revision">Revision</h2>
<p>In January last year I took and passed the AWS Solutions Architect Associate exam, so I was already familiar with a number of the core services that AWS offers. On top of that I have around 3 years of experience with AWS, so a lot of my revision was aimed at the services I do not tend to use or have not had any exposure to.</p>
<p>The services that I identified and focused my revision on were:</p>
<ul>
<li>Storage gateway</li>
<li>Backup</li>
<li>Api Gateway</li>
<li>AWS Orgs</li>
<li>CloudFormation</li>
<li>CloudTrail</li>
<li>Config</li>
<li>ElasticSearch</li>
<li>RAM</li>
<li>SSO</li>
<li>ACM</li>
<li>KMS</li>
<li>Directory Service</li>
<li>WAF & Shield</li>
<li>Cost Explorer</li>
<li>Step Functions</li>
<li>EventBridge (CloudWatch Events)</li>
</ul>
<p>Identifying the services you are least confident with and focusing on building up knowledge in those areas is a good starting point for revision: it keeps you focused when you do sit down to revise, because you always have a list of topics to look into. On top of this I would also do a refresher on other services like EC2, Auto Scaling, Load Balancing, RDS, etc., but a lot of the focus initially was on the newer or less familiar services.</p>
<p>After doing some research around the new exam, I found a few articles which spoke about it, including Adrian Cantrill’s <a href="https://www.linkedin.com/pulse/how-prepare-upcoming-aws-certified-sysops-associate-adrian-cantrill/">How to prepare for the upcoming AWS Certified SysOps Administrator - Associate (SOA-C02) Exam</a>, as well as his thoughts on the exam after he had taken it in <a href="https://adriancantrill.medium.com/my-thoughts-on-the-sysops-administrator-associate-beta-exam-soa-c02-db34d31d8e3">My Thoughts On the SysOps Administrator Associate BETA Exam — (SOA-C02)</a>.</p>
<p>I later found a course of his that specifically targets the new AWS SysOps Administrator exam rather than the older one. I was quite cautious of any courses or resources that had simply been rebranded as compatible with the new exam, as I thought they wouldn’t be up to date or wouldn’t cover the latest AWS services. I opted to give Adrian’s course <a href="https://learn.cantrill.io/p/aws-certified-sysops-administrator-associate">AWS Certified SysOps Administrator - Associate</a> a try after viewing some of the free videos available on the course.</p>
<p>What I primarily liked about his course was the use of graphics to describe how many of these AWS services function. I’m not the most confident of readers, and reading isn’t how I best take on new information, but when conveyed through images and his accompanying demos the information stuck and was much easier to understand.</p>
<figure>
<img src="/images/2021-05-18-AWS-SysOps-Certification/adrian-cantrill-example-image.png" class="centered" alt="Adrian Cantrill SysOps Course" />
<figcaption>An example of the images used in Adrian Cantrill's courses</figcaption>
</figure>
<p>After building up some confidence with my revision I started to take practice exams provided by <a href="https://www.whizlabs.com/aws-sysops-administrator-associate/">WhizLabs</a>, as I had found their practice exams useful for my Architect exam in the past. On reflection, the WhizLabs practice exams did not seem up to date, as the most recent AWS services were not covered in the ones I took; with that said, they were still beneficial in testing my knowledge and allowed me to identify areas where I needed to improve. Next time I plan on taking a look at the practice exams provided by <a href="https://tutorialsdojo.com/">Tutorials Dojo</a>, as I have heard good things about them and want to find more practice exam resources.</p>
<h2 id="the-exam">The Exam</h2>
<p>The online proctored exam was well set up: I had to sign in, confirm my identity and then take some photos of my desk and the surrounding area for inspection, to ensure there were no materials I could use to cheat or gain an advantage in the exam. I was then called by the proctor to confirm a few last-minute details and informed that I had to remove the coaster and some pens from my desk. Once all this was done, they wished me luck and, after I accepted the Terms and Conditions, the exam began. Because this was a Beta exam, I had a total of 3 hours and 45 minutes to complete all sections of the exam.</p>
<p>Then, the exam started. It was split into 2 parts: the first involved the multiple-choice and multi-answer questions, and the second the hands-on labs. The first part contained 53 questions. Any I was unsure of I marked for review to come back to later; as I am not the fastest of readers, I also tend to leave the larger, wordier questions until last and focus on the shorter questions to begin with - this approach works well for me at least. Once I had been through all the questions, I had a chance to review the ones I had marked earlier. I still had around 3 hours left at that point, so I took my time carefully reading the larger questions to ascertain what they were asking and answered to the best of my ability.</p>
<p>After completing the review, I marked the section as complete. You are then warned that you will not be able to come back to this section once it is completed, so that you cannot get answers from the AWS console when the hands-on labs start. I accepted and continued on to the second part of the exam, which focused on the labs.</p>
<p>Getting onto the lab section of the exam, there were about 2 hours and 30 minutes remaining, and it was recommended that you spend 20 minutes on each of the labs you were given. Sadly, I am not allowed to say exactly what I had to do in the hands-on labs, but the areas my questions focused on were:</p>
<ul>
<li>High availability VPC</li>
<li>CloudWatch Alarms</li>
<li>CloudFormation</li>
</ul>
<p>The labs were set up through a virtual Windows machine, giving access to a browser pointed at the AWS console. You followed a set of instructions to provision or amend some resources to a specification, and once you were done you could move on to the next lab. I felt the labs were well implemented and, apart from some issues copying and pasting from the exam portal (OnVue) into the Windows virtual machine, I had no complaints.</p>
<p>Once you had finished one lab you could progress to the next, but you were not allowed to return to previous labs. During the labs you were given access to the whole AWS console (although I read that navigating the browser away from the page would trigger a warning), so you were not able to look at AWS documentation, for instance. You also had the option to use AWS CloudShell if you wanted to provision resources via the CLI instead of the console; for ease I just used the console.</p>
<p>Each of the labs took me around 20-25 minutes, but as I had the time I didn’t rush and went through the specification of what was asked thoroughly before moving on to the next lab. With just over an hour remaining I confirmed the completion of the labs and submitted the exam.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Firstly, I genuinely enjoyed taking this exam. The inclusion of hands-on lab style questions is a huge positive in my book, as it means you need to know AWS from a practical standpoint as well as a theoretical one. In the future I would like it to become a staple that AWS exams include a hands-on lab element to test the candidate in a different way.</p>
<p>I’m happy with how the online proctored exam was carried out; sign-in and setup were quick and easy in my experience, and this will likely be how I take my exams going forward.</p>
<p>The new SOA-C02 SysOps exam will be available to register for from the 29th of June, and the old SOA-C01 SysOps exam will be retired as of the 26th of July. If you are looking to take the new SysOps exam, you can find more information on the <a href="https://aws.amazon.com/certification/coming-soon/">Coming Soon to AWS Certification</a> page.</p>
<p><a href="https://capgemini.github.io/aws/My-Experience-With-The-New-AWS-SysOps-Certification-Exam-SOA-C02/">My experience with the new AWS SysOps Certification exam (SOA-C02)</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on May 21, 2021.</p>https://capgemini.github.io/engineering/user-privacy-and-data-use-in-ios-142021-03-05T00:00:00+00:002021-03-05T00:00:00+00:00Riccardo Freschihttps://capgemini.github.io/alumni#author-riccardo-freschi
<h2 id="background">Background</h2>
<p>Digital Advertising is a form of marketing which uses the Internet to deliver promotional messages to consumers.</p>
<p>In the Digital Advertising ecosystem there are 3 main actors:</p>
<ul>
<li>consumers, who are the recipients of the messages</li>
<li>advertisers, who are the entities willing to spread a specific message about their service or product</li>
<li>publishers, who own the space to display an advertiser’s message.</li>
</ul>
<p>Consumers are all of us, people who might be interested in a product, hence the target of the value proposition.</p>
<p>Advertisers can be anything from physical people to brick and mortar shops to gaming companies or big corporations, etc.</p>
<p>Publishers can be website owners interested in selling an area of a webpage to display an advertisement, or they can be social networks, search engines or any other entity with a Web presence.
Publishers can also act as advertisers and vice versa: think of a website which hosts ad space and at the same time promotes its service or product on other platforms.</p>
<p>In between publishers and advertisers there are a number of second level entities which make the ad space fulfilment possible. They jointly form the so-called advertising technology (AdTech) stack.
Demand Side Platforms (DSPs), Supply Side Platforms (SSPs), Ad Exchanges and Ad Networks form the core of the AdTech stack.</p>
<p>SSPs are platforms enabling publishers to manage, sell and optimize their available inventory. On the opposite side of the spectrum we find DSPs. DSPs allow advertisers working at brands and ad agencies to buy inventory on an impression-by-impression basis from SSPs.</p>
<p>Ad Exchanges are the actual digital marketplaces in between DSPs and SSPs where the purchase of a given ad space happens, typically via real-time bidding (RTB) auctions.</p>
<p>Ad Networks are brokers of inventory and are also generally placed between DSPs and SSPs.</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/adtechhighlevel.jpg" alt="AdTech stack high level" /></p>
<p>The criteria governing the buying and selling of advertising space are not limited to price comparison: advertisers are interested in optimising their investment, which means spending the least for the maximum chance of converting a prospect into a customer.</p>
<p>Such conversion is much more likely if the profile of the buyer persona and the profile of the candidate consumer match. The buyer persona describes the customer archetype. The candidate consumer is also called the “target”. And “targeted advertising” is the first controversial actor of our story.</p>
<p>Targeted advertising is the technique of directing the messaging towards an audience with certain traits, based on the product the advertiser is promoting. These traits can be anything from demographic, to income, to personality, to lifestyle, etc., gathered via many different means, altogether going by the name of “tracking”.</p>
<p>By itself, targeted advertising would not be so bad: doesn’t everyone prefer to be bothered with propositions of products more relevant to their interests rather than not?</p>
<p>The problem is tracking.</p>
<p>Technically, tracking refers to the act of collecting user or device data from a website or a mobile application (commonly referred to as an “app”) and linking it with other data collected from other companies’ apps, websites, or offline properties.</p>
<p>Tracking also refers to sharing the collected data with Data Brokers, which are companies whose primary business is collecting personal information about consumers from a variety of sources and aggregating, analysing, and sharing that information, or information derived from it. Data Brokers are the source of consumers’ profile information, which forms the foundation of digital advertising auctions: whenever a consumer visits a webpage hosting an ad, before presenting that same ad, a number of events take place in the AdTech stack, leading to a series of bids to purchase the ad space; such bids are based on the profile of the consumer provided by Data Brokers. It is fair to say, though, that not all ad space is sold through the full AdTech stack, and hence not all of it involves profile information sourced from Data Brokers: there are environments where the stack is squashed into a single platform (think of the usual suspects, Google and Facebook), where advertisers can acquire space directly from a company that, in such a case, plays the role of publisher, SSP, Ad Network… basically the full stack.
It is interesting to note that even incumbents like Facebook rely on Data Brokers for profiling: in <a href="https://www.forbes.com/sites/kalevleetaru/2018/04/05/the-data-brokers-so-powerful-even-facebook-bought-their-data-but-they-got-me-wildly-wrong/?sh=1a0d04e63107">a report from Forbes</a></p>
<blockquote>
<p>Facebook argues that it must buy this data because that is simply how advertising is done today and that companies want to use the same marketing selectors across every platform.</p>
</blockquote>
<p>Coming back to tracking, <a href="https://developer.apple.com/app-store/user-privacy-and-data-use/">according to Apple</a>, examples of tracking include:</p>
<blockquote>
<ul>
<li>Displaying targeted advertisements in your app based on user data collected from apps and websites owned by other companies.</li>
<li>Sharing device location data or email lists with a data broker.</li>
<li>Sharing a list of emails, advertising IDs, or other IDs with a third-party advertising network that uses that information to retarget those users in other developers’ apps or to find similar users.</li>
<li>Placing a third-party SDK in your app that combines user data from your app with user data from other developers’ apps to target advertising or measure advertising efficiency, even if you don’t use the SDK for these purposes. For example, using a login SDK that repurposes the data it collects from your app to enable targeted advertising in other developers’ apps.</li>
</ul>
<p>The following situations are not considered tracking:</p>
<ul>
<li>When the data is linked solely on the end-user’s device and is not sent off the device in a way that can identify the end-user or device.</li>
<li>When the data broker uses the data shared with them solely for fraud detection or prevention or security purposes, and solely on your behalf.</li>
</ul>
</blockquote>
<p>SDKs (Software Development Kits) are third party software components embedded in apps, which implement a large variety of pieces of functionality. Because they’re useful and generally easy to use, SDKs are embedded in lots of the published apps. A comprehensive list and description of the most used SDKs is maintained by <a href="https://mightysignal.com/top-ios-sdks">MightySignal</a>.</p>
<p>A number of studies from accredited government institutions and news media have brought to the public’s attention that:</p>
<p><a href="https://arxiv.org/pdf/1804.03603.pdf">University of Oxford</a>:</p>
<blockquote>
<p>A very large number of apps embed third party SDKs, which form networks that link activity across multiple apps to a single user, and also link to their activities on other devices or mediums like the web. This enables construction of detailed profiles about individuals, which could include inferences about shopping habits, socio-economic class or likely political opinions.</p>
</blockquote>
<p><a href="https://www.aeaweb.org/articles?id=10.1257/jel.54.2.442">Journal of Economic Literature</a>:</p>
<blockquote>
<p>consumers’ ability to make informed decisions about their privacy is severely hindered because consumers are often in a position of imperfect or asymmetric information regarding when their data is collected, for what purposes, and with what consequences.</p>
</blockquote>
<p><a href="https://www.nytimes.com/interactive/2019/12/19/opinion/location-tracking-cell-phone.html">New York Times</a>, reporting on location tracking:</p>
<blockquote>
<p>[the data reviewed] originated from a location data company, one of dozens quietly collecting precise movements using software slipped onto mobile phone apps.
[…]
The companies that collect all this information on your movements justify their business on the basis of three claims: People consent to be tracked, the data is anonymous and the data is secure. None of those claims hold up, based on the file we’ve obtained and our review of company practices. Yes, the location data contains billions of data points with no identifiable information like names or email addresses. But it’s child’s play to connect real names to the dots that appear on the maps.
[…]
Describing location data as anonymous is “a completely false claim” that has been debunked in multiple studies, Paul Ohm, a law professor and privacy researcher at the Georgetown University Law Center, told us. “Really precise, longitudinal geolocation information is absolutely impossible to anonymize.”
[…]
“If you have an S.D.K. that’s frequently collecting location data, it is more than likely being resold across the industry,” said Nick Hall, chief executive of the data marketplace company VenPath.
[…]
“If a private company is legally collecting location data, they’re free to spread it or share it however they want,” said Calli Schroeder, a lawyer for the privacy and data protection company VeraSafe.</p>
</blockquote>
<p><a href="https://arxiv.org/pdf/1804.03603.pdf">University of Oxford</a>:</p>
<blockquote>
<p>[…] most apps [959,000 apps from the US and UK Google Play stores] contain third party tracking, and the distribution of trackers is long-tailed with several highly dominant trackers accounting for a large portion of the coverage.
[…]
the median number of tracker hosts included in the bytecode of an app was 10. 90.4% of apps included at least one, and 17.9% more than twenty.</p>
</blockquote>
<p><a href="https://fil.forbrukerradet.no/wp-content/uploads/2020/01/2020-01-14-out-of-control-final-version.pdf">The Consumer Council of Norway</a>, following an investigation:</p>
<blockquote>
<ul>
<li>20 months after the GDPR has come into effect, consumers are still pervasively tracked and profiled online and have no way of knowing which entities process their data and how to stop them.</li>
<li>The adtech industry is operating with out of control data sharing and processing, despite legislation that should limit most, if not all, of the practices identified throughout this report.</li>
<li>The digital marketing and adtech industry has to make comprehensive changes in order to comply with European regulation, and to ensure that they respect consumers’ fundamental rights and freedoms.</li>
</ul>
</blockquote>
<p><a href="https://www.forbes.com/sites/kalevleetaru/2018/04/05/the-data-brokers-so-powerful-even-facebook-bought-their-data-but-they-got-me-wildly-wrong">Forbes</a>:</p>
<blockquote>
<p>In the world of Data Brokers, you have no idea who all has bought, acquired or harvested information about you, what they do with it, who they provide it to, whether it is right or wrong or how much money is being made on your digital identity. Nor do you have the right to demand that they delete their profile on you.</p>
</blockquote>
<p>Consequently, government regulators have been taking action by sanctioning those found in breach and by setting rules on personal information handling.</p>
<p>Some of the most notable penalty examples are:</p>
<p><a href="https://www.ftc.gov/news-events/press-releases/2019/07/ftc-imposes-5-billion-penalty-sweeping-new-privacy-restrictions">Facebook</a>:</p>
<blockquote>
<p>Facebook, Inc. will pay a record-breaking $5 billion penalty [… for] deceiving users about their ability to control the privacy of their personal information.</p>
</blockquote>
<p><a href="https://www.wsj.com/articles/twitter-could-pay-ftc-fine-over-alleged-privacy-violations-11596501001">Twitter</a>:</p>
<blockquote>
<p>the FTC [alleged Twitter] used phone numbers and email addresses that were given to the company for safety and security purposes for targeted advertising between 2013 and 2019.</p>
</blockquote>
<p><a href="https://www.cnil.fr/en/cnils-restricted-committee-imposes-financial-penalty-50-million-euros-against-google-llc">Google</a>:</p>
<blockquote>
<p>On 21 January 2019, the CNIL’s restricted committee imposed a financial penalty of 50 Million euros against the company Google LLC, in accordance with the General Data Protection Regulation (GDPR), for lack of transparency, inadequate information and lack of valid consent regarding the ads personalization.</p>
</blockquote>
<p>As far as the introduction of data protection regulations is concerned, two of the most prominent examples are the California Consumer Privacy Act (CCPA) and the European General Data Protection Regulation (GDPR).</p>
<p>CCPA is based on four <a href="https://oag.ca.gov/privacy/ccpa">founding principles</a>, which state that consumers have:</p>
<blockquote>
<ul>
<li>The right to know about the personal information a business collects about them and how it is used and shared;</li>
<li>The right to delete personal information collected from them (with some exceptions);</li>
<li>The right to opt-out of the sale of their personal information; and</li>
<li>The right to non-discrimination for exercising their CCPA rights.</li>
</ul>
</blockquote>
<p>Similarly, GDPR sets out <a href="https://gdpr-info.eu/">seven key principles</a>, which lie at the heart of its general data protection regime:</p>
<blockquote>
<ul>
<li>Lawfulness, fairness and transparency</li>
<li>Purpose limitation</li>
<li>Data minimisation</li>
<li>Accuracy</li>
<li>Storage limitation</li>
<li>Integrity and confidentiality (security)</li>
<li>Accountability</li>
</ul>
</blockquote>
<p>The underlying theme can then be summarised as:</p>
<ul>
<li>Reduce consumer data collection to the strictly necessary</li>
<li>Be transparent on why you need it and on what you do with it</li>
<li>Be careful of how you handle it, you are accountable for that</li>
</ul>
<p>Those are basically the same principles adopted by Apple in its approach to privacy.</p>
<h2 id="privacy-on-ios-14">Privacy on iOS 14</h2>
<h3 id="before-ios-14">Before iOS 14</h3>
<p>Over the years, Apple has distinguished itself by paying special attention to making its products more privacy friendly and helping keep users’ identity and data more secure.</p>
<p>On Safari, the main feature introduced to hinder tracking is Intelligent Tracking Prevention (ITP), an initiative started by Apple in 2017.</p>
<p>In the browser context, tracking has historically been achieved by third party components (similar to the SDKs described above) dropping small bits of information called cookies into the consumer’s browser. Frequently those cookies carry just one piece of information: a constant identifier assigned to the consumer (or rather her/his browser); the rest of the profile is stored on a server which the component communicates with, directly or indirectly via Data Brokers, to incrementally add information while the consumer visits other sites that also embed the same component.</p>
<p>Back to ITP: Safari started first with blocking third-party cookies, then later on, tightened the noose around client side first-party cookies too, by putting a 7-day expiration date on them. This change was added to iOS 12.2 and Safari 12.1 on macOS High Sierra and Mojave.</p>
<p>First-party cookies are cookies added by the website the consumer is visiting. Placing tracking information in first-party cookies is a workaround recently developed (for good or bad) by some companies to overcome the limitations caused by the blockage of third-party cookies (<a href="https://www.facebook.com/business/help/471978536642445?id=1205376682832142">e.g. Facebook</a>).</p>
<p>More recently, with version 14, released in September 2020, Apple extended ITP with a “Privacy Report”, listing all the trackers Safari detected during the consumer’s site visits.</p>
<p>On the mobile side, Apple introduced the first advertising-related features in 2012, with iOS 6.</p>
<p>First, it removed the Unique Device Identifier (UDID), a constant identifier associated with the device, previously always available to apps and playing a role similar to the tracking cookie in the browser context.</p>
<p>Second, it put in place another device identifier, named the Identifier for Advertisers (IDFA), which is a string commonly represented by numbers and letters (technically a 128-bit value, called a UUID). Unlike the UDID, the IDFA can be made unavailable to apps if the third new feature, called Limit Ad Tracking (LAT), is switched on by the user.</p>
<p>When LAT is enabled, the user’s IDFA is zeroed out (i.e., the value is replaced with zeros) when accessed by apps, hence hiding the device identity.</p>
<p>In reality, prior to iOS 10, the IDFA was still passed even if a user had enabled LAT, but it was accompanied by a request not to use it. Many companies decided not to honour this request, so Apple decided to zero out the IDFA from iOS 10 onwards.</p>
<figure>
<img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/iPhone1.png" alt="iPhone LAT" class="centered medium-8" />
<figcaption>All iPhone images taken from apple.com</figcaption>
</figure>
<p>More recently, Apple enabled users to opt out of Location-Based Apple Ads: without the opt-out, if a user granted the App Store or Apple News access to their device location, Apple’s advertising platform would use the current location of the device to serve geographically targeted ads in the App Store and Apple News apps.</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/iPhone2.jpg" alt="iPhone Ad Targeting" /></p>
<p>iOS 13 brought with it an update to location data controls.</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/iPhone3.jpg" alt="iPhone Location Services" /></p>
<p>Firstly, users were periodically shown messages telling them which apps were using their location data in the background (i.e., while the app in question was not actively being used).</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/iPhone4.jpg" alt="iPhone Location tracking in background" /></p>
<p>Secondly, Apple changed the options available when users were presented with the popup to choose whether an app could use their location data: the original options were updated from “Always, Never, and While using” to “Allow While Using App, Allow Once, and Don’t Allow”.</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/iPhone5.jpg" alt="iPhone Maps location access" /></p>
<p>Other privacy-related additions in iOS 13 were the permission to use Bluetooth and the permission to read Contacts’ notes: previously, apps could access those capabilities freely, whereas with the new OS the user is prompted to approve access.</p>
<p>That leads us to the present day and to the controversial protagonist of this piece: iOS 14 (14.5 in its latest incarnation, the beta of which was released on February 4th, 2021).</p>
<h3 id="after-ios-14">After iOS 14</h3>
<p>With iOS 14, Apple delivers a number of new <a href="https://www.apple.com/ios/ios-14/features">privacy-related features</a>:</p>
<p><img src="/images/2021-02-19-user-privacy-and-data-use-in-ios-14/rel-notes-privacy.png" alt="iOS 14 Privacy-related features" /></p>
<p>The most debated ones are the two at the top of the list above: “Privacy information on the App Store” and “App tracking controls and transparency”.</p>
<p>“Privacy information on the App Store” states that in order to submit new apps and app updates, application publishers must provide information about their privacy practices in App Store Connect. If their apps use third-party code, such as advertising or analytics SDKs, they also need to describe what data the third-party code collects, how the data may be used, and whether the data is used to track users, unless the captured data meets all of the criteria for optional disclosure listed below:</p>
<ul>
<li>The data is not used for tracking, which means that the data is not linked with Third-Party Data for advertising or advertising measurement or shared with a data broker.</li>
<li>The data is not used for Third-Party Advertising, for the app publisher’s Advertising or Marketing purposes, or for Other Purposes</li>
<li>Collection of the data occurs only in infrequent cases that are not part of the app’s primary functionality, and which are optional for the user.</li>
<li>The data is provided by the user via the app’s interface, it is clear to the user what data is collected, the user’s name or account name is prominently displayed in the submission form alongside the other data elements being submitted, and the user affirmatively chooses to provide the data for collection each time.</li>
</ul>
<p>Apps in the Regulated Financial Services and Health Research categories can optionally disclose some of the data they collect, provided some extra criteria are met.</p>
<p>In its guidance, Apple also provides definitions for the different types of data, such as “Email Address” and definitions for data use purposes, such as “Third-Party Advertising”, to help app publishers understand what kind of data falls within which policy.</p>
<p>For every type of captured data, Apple requires app publishers to identify whether it is linked to the user’s identity (via their account, device, or other details), either by the app publishers themselves or by their partners. Data collected from an app is considered linked to the user’s identity unless privacy protections are put in place before collection to anonymize it, such as stripping the data of any direct identifiers (e.g., user ID or name).
Additionally, after collection, data must not be linked back to the user’s identity, either directly or by being tied to other datasets that enable it to be linked indirectly.</p>
<p>On the second privacy update, “App tracking controls and transparency”: app publishers need to receive the user’s permission through the AppTrackingTransparency framework to</p>
<ul>
<li>access the device’s IDFA or</li>
<li>(in general) track them.</li>
</ul>
<p><strong>This is the big change</strong>: whereas previously, as seen above, the IDFA was zeroed out only when LAT was enabled, with iOS 14 (14.5, to be exact) it is always zeroed out unless the app first obtains the user’s approval by requesting it via the AppTrackingTransparency framework. Such a request results in the user being presented with a popup prompting them to grant the app access to the IDFA. (The popup can be customised with a purpose string to add more information about why the app needs access to the identifier.)</p>
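<p>As a rough illustration (not Apple’s sample code), a minimal permission request in Swift might look like the sketch below. It assumes an <code class="language-plaintext highlighter-rouge">NSUserTrackingUsageDescription</code> purpose string has been added to the app’s Info.plist, and the function name is just an example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import AppTrackingTransparency
import AdSupport

// Illustrative sketch only: ask for tracking permission and read the IDFA
// if, and only if, the user grants it.
func requestTrackingPermission() {
    ATTrackingManager.requestTrackingAuthorization { status in
        if status == .authorized {
            // Permission granted: the real IDFA is available.
            let idfa = ASIdentifierManager.shared().advertisingIdentifier
            print("IDFA: \(idfa.uuidString)")
        } else {
            // Denied, restricted or not determined: the IDFA is zeroed out.
            print("Tracking not permitted")
        }
    }
}
</code></pre></div></div>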
<p><strong>The change will specifically affect the ad targeting side of the ecosystem, in all its forms: segmentation, retargeting, lookalike audiences, exclusion targeting, etc.</strong></p>
<p>In fact, a large number of advertising platforms today rely on the IDFA, e.g. the <a href="https://developers.google.com/admob/ios/download">Google Mobile Ads SDK</a>:</p>
<blockquote>
<p>The Mobile Ads SDK for iOS utilizes Apple’s advertising identifier</p>
</blockquote>
<p>The change actually goes even further than fencing off access to the IDFA: from Apple’s FAQ section we understand that the following practices might result in App Store rejection:</p>
<ul>
<li>gating functionalities or incentivising the user to grant tracking permission</li>
<li>using another identifier (e.g., a hashed email address or hashed phone number), unless permission is granted through the AppTrackingTransparency framework</li>
<li>fingerprinting or using signals from the device to try identifying the device or a user</li>
<li>tracking performed by an integrated third-party SDK, even in the case of a single sign-on (SSO) SDK</li>
</ul>
<p>It is evident, then, that Apple will not tolerate any attempt at tracking that has not been explicitly granted, and that the app publisher is deemed fully responsible for the code running in their app, including the code running in embedded SDKs produced by third parties.</p>
<p>Content providers who own multiple apps and want to apply analytics across them have the option of using another ID, the Identifier for Vendors (IDFV), without any obligation to request the user’s permission via the AppTrackingTransparency framework. That holds, though, only as long as the IDFV is not combined with other data to track a user across apps and websites owned by other companies; in that case, permission still needs to be granted via AppTrackingTransparency.</p>
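<p>As an aside, reading the IDFV in Swift requires no prompt at all; a hypothetical one-liner is sketched below, and again it only stays prompt-free as long as the identifier is not combined with cross-company tracking data:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import UIKit

// Illustrative only: the IDFV is shared across apps from the same vendor
// and may be nil for a short period after installation.
let idfv = UIDevice.current.identifierForVendor?.uuidString
</code></pre></div></div>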
<p>A piece of functionality also affected by the privacy changes is “attribution”: whenever an app is installed as a consequence of the user tapping on an advertisement in another app, a common practice today is to leverage the IDFA to detect which ad on which device resulted in the conversion, and hence measure the effectiveness of the advertising campaign.
Apple’s guidelines recommend adopting the SKAdNetwork framework instead, for which the AppTrackingTransparency grant is not required.</p>
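<p>As a hedged sketch (the exact calls an app needs depend on its role in the campaign), adopting SKAdNetwork can be as simple as the following, with no AppTrackingTransparency prompt involved:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import StoreKit

// Illustrative sketch: register the install for SKAdNetwork attribution and,
// optionally, report a post-install conversion value (0 to 63).
if #available(iOS 14.0, *) {
    SKAdNetwork.registerAppForAdNetworkAttribution()
    SKAdNetwork.updateConversionValue(10)
}
</code></pre></div></div>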
<p>In another note, Apple announces <a href="https://webkit.org/blog/11529/introducing-private-click-measurement-pcm">upcoming support for Private Click Measurement</a>, which helps advertising networks measure the effectiveness of advertisement clicks within iOS or iPadOS apps that navigate to a website. This may be a welcome change for the advertising business: since the IDFA is not available in the browser, tracing a conversion from an ad tapped in an app through to a web page was hard before, whereas it now becomes possible.</p>
<h2 id="conclusion">Conclusion</h2>
<p>We have gone through an overview of the main entities operating in Digital Advertising and seen how targeted advertising in particular heavily leverages tracking to build up consumer profiles, which are used to estimate the likelihood of ad conversions.</p>
<p>We have seen that said profiles are frequently built by capturing and combining data in opaque ways, which has raised concerns among the public and governments, leading to the introduction of regulations such as GDPR.</p>
<p>In the last part, I presented some of Apple’s efforts to support user privacy in both Safari and iOS, opposing tracking in much the same way that personal data laws around the world attempt to.
I paid special attention to the latest privacy changes introduced with iOS 14, describing the impact on IDFA usage and on the App Store publishing process for mobile apps.</p>
<p>I believe the complications for those who want to keep pursuing tracking behind the scenes will be considerable, even though technical workarounds may well be found.</p>
<p>As for those willing to operate in the clear, my recommendation is to apply the updates described in the previous sections.</p>
<p>In summary: <strong>treat users with respect and seek their consent; if tracking is needed, ask for permission; disclose transparently what will happen to consumers’ data and why they should agree to make it available; give users easy ways to update and change their preferences; keep control of where the acquired data flows; go even further and adopt anonymization where possible; and delete data when it is no longer needed</strong>.</p>
<p><a href="https://capgemini.github.io/engineering/user-privacy-and-data-use-in-ios-14/">User Privacy and Data Use in iOS 14</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on March 05, 2021.</p>https://capgemini.github.io/development/elasticsearch-deeper-dive2021-02-26T00:00:00+00:002021-02-26T00:00:00+00:00Kamar Alihttps://capgemini.github.io/authors#author-kamar-ali
<h2 id="introduction">Introduction</h2>
<p>In <a href="https://capgemini.github.io/development/elasticsearch-introduction/">my previous post</a> we looked at getting started with Elasticsearch, covering some basic concepts and getting some hands on.</p>
<p>In this article I want to expand on that, taking a deeper dive and covering the following:</p>
<ul>
<li>Importing large amounts of data</li>
<li>Trimming results</li>
<li>Paging results</li>
<li>Scoring</li>
<li>QueryDSL</li>
</ul>
<p>Before getting started make sure you have Elasticsearch installed and running, details of which can be found in the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html">official Elasticsearch documentation.</a></p>
<h2 id="importing-data">Importing data</h2>
<p>To import large amounts of data into Elasticsearch we will be using the bulk API, which allows us to index a lot of data with a single API call.</p>
<p>Before doing this we will need some large datasets to import. I used an <a href="https://www.json-generator.com">online JSON generator</a> to produce the datasets required for the work we will be doing.
On the left-hand side of the panel is the generator configuration. I will be using the following configuration, which will generate three JSON objects with the specified fields (you can paste this in).</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[
'{{repeat(3, 3)}}',
{
id: '{{objectId()}}',
eyeColor: '{{random("blue", "brown", "green")}}',
name: '{{firstName()}} {{surname()}}',
gender: '{{gender()}}',
company: '{{company().toUpperCase()}}',
email: '{{email()}}',
phone: '+1 {{phone()}}'
}
]
</code></pre></div></div>
<p>When ready click ‘Generate’ at the top of the screen. This will display your randomly generated JSON documents on the right hand side of the screen.</p>
<p>Before running the bulk import, we will need to make a few tweaks to the JSON documents for the import process to function correctly.</p>
<ul>
<li>Before downloading the data, ensure you have selected ‘Compact’, as the bulk import uses newline characters to delimit each document.</li>
<li>Once the data has been downloaded, we will need to remove the square brackets at the beginning and end of the document so we only have our individual JSON objects.</li>
<li>We’ll need to ensure that there is a new line character at the end of each JSON document.</li>
<li>Finally, ensure that you’re adding an ‘action line’ before each document. This tells Elasticsearch what to do with the document that follows; by leaving the index action empty we are telling Elasticsearch to create an ID for us.</li>
</ul>
<p>Here is an example of what my JSON document, ready for bulk import, looks like. You’re welcome to copy this and save it as a file called data.json.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{"index":{}}
{"id":"5fbce76b51f6720fb820a80c","eyeColor":"blue","name":"Mcintosh Cooley","gender":"male","company":"example-company-1","email":"mcintoshcooley@example-company-1.com","phone":"+1 (993) 503-3824"}
{"index":{}}
{"id":"5fbce76b0d94b91072f4a988","eyeColor":"blue","name":"Carole Decker","gender":"female","company":"example-company-2","email":"caroledecker@example-company-2.com","phone":"+1 (929) 421-3982"}
{"index":{}}
{"id":"5fbce76b07595151f4100b57","eyeColor":"green","name":"Baxter Andrews","gender":"male","company":"example-company-3","email":"baxterandrews@example-company-3.com","phone":"+1 (911) 448-2944"}
</code></pre></div></div>
<p>Once we have prepared the data, we’ll be using curl to hit the bulk import API.
As the generated data is based on people, we need to create an index and type that make sense, so I decided to model this as a university course student list: a ‘computer_science’ index with type ‘students’.</p>
<p>In this example, the JSON data file I have created is named ‘data.json’.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/computer_science/students/_bulk?pretty' --data-binary @data.json
</code></pre></div></div>
<p>Once you’ve executed the command the console will output data similar to what is shown below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took" : 423,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "computer_science",
"_type" : "students",
"_id" : "tK7s-XUBlEyYNvSPxirU",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1,
"status" : 201
}
}...
]
}
</code></pre></div></div>
<h2 id="trimming-search-results">Trimming search results</h2>
<p>We can now search our stored student data. The following command will query for all students who have blue eyes:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue"
</code></pre></div></div>
<p>And the result of the query is as follows:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 11,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b51f6720fb820a80c",
"eyeColor": "blue",
"name": "Mcintosh Cooley",
"gender": "male",
"company": "example-company-1",
"email": "mcintoshcooley@example-company-1.com",
"phone": "+1 (993) 503-3824"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<p>If we’ve got thousands of results and we’re not interested in the actual data just yet, we can trim the results to remove the <code class="language-plaintext highlighter-rouge">_source</code> of our hits.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue&_source=false"
</code></pre></div></div>
<p>With the result looking a lot cleaner:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 0.4700036
},
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036
}
]
}
}
</code></pre></div></div>
<p>We can continue to trim this down and limit the number of hits using the size parameter.
An example of where we may want to do this is if we just want to know the total number of students returned by our query; setting the size to zero returns the totals without any hits.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue&_source=false&size=1"
</code></pre></div></div>
<p>And here’s the output; as you can see, no documents are returned in the hits array.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
</code></pre></div></div>
<h2 id="paging-results">Paging results</h2>
<p>I only have three documents stored in Elasticsearch, but we can still demonstrate paging results.</p>
<p>Limits can be added to the results using the ‘size’ parameter, as demonstrated previously.
So if we just want the first result we can use ‘size=1’, and this will get us the first result (counting from 0). We aren’t sorting yet, so Elasticsearch is returning these in an arbitrary order.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue&_source=false&size=1"
</code></pre></div></div>
<p>This will fetch us the first document:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b51f6720fb820a80c",
"eyeColor": "blue",
"name": "Mcintosh Cooley",
"gender": "male",
"company": "example-company-1",
"email": "mcintoshcooley@example-company-1.com",
"phone": "+1 (993) 503-3824"
}
}
]
}
}
</code></pre></div></div>
<p>If we want to get the next document we can use the ‘from’ parameter in conjunction with the ‘size’ parameter. This tells Elasticsearch to grab x documents starting from position y.
To get our second result we would use ‘from=1’.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue&size=1&from=1"
</code></pre></div></div>
<p>As you can see we now have our second student:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<p>If we try to go beyond this (we only have two students with blue eyes) we will get a blank result:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/_search?q=blue&size=1&from=2"
</code></pre></div></div>
<p>And the result:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": []
}
}
</code></pre></div></div>
<h2 id="scoring-in-elasticsearch">Scoring in Elasticsearch</h2>
<p>You may have noticed that a score field is returned with each document in the results of our Elasticsearch queries.
This is how Elasticsearch signals to us how the results rank in terms of relevance to our query, taking into account field matches and any additional configuration we may have used.
The score itself is calculated using the <a href="https://www.elastic.co/guide/en/elasticsearch/guide/current/practical-scoring-function.html">Lucene Practical Scoring Function</a>.</p>
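<p>If you are curious how a particular score was arrived at, you can, for example, ask Elasticsearch to explain it by adding the <code class="language-plaintext highlighter-rouge">explain</code> parameter to a search; each hit should then include an <code class="language-plaintext highlighter-rouge">_explanation</code> section breaking the score down into its components.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl -X GET "localhost:9200/computer_science/_search?q=blue&explain=true"
</code></pre></div></div>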
<h2 id="querydsl">QueryDSL</h2>
<p>QueryDSL (Domain Specific Language) is a framework we can use for more specific and efficient searches by providing our criteria in the request body as JSON.
There are two types of query clauses:</p>
<ul>
<li>Leaf query clause: Looks for a certain value in a particular field</li>
<li>Compound query clause: A combination of leaf queries and other compound queries</li>
</ul>
<h3 id="match-all">Match All</h3>
<p>The most basic query is the <code class="language-plaintext highlighter-rouge">match_all</code> query, which will return everything.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"match_all":{
}
}
}
</code></pre></div></div>
<p>And on running this we get the following result; as you can see, all three of our students are returned:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 8,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 1.0,
"_source": {
"id": "5fbce76b51f6720fb820a80c",
"eyeColor": "blue",
"name": "Mcintosh Cooley",
"gender": "male",
"company": "example-company-1",
"email": "mcintoshcooley@example-company-1.com",
"phone": "+1 (993) 503-3824"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 1.0,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "tq7s-XUBlEyYNvSPxirV",
"_score": 1.0,
"_source": {
"id": "5fbce76b07595151f4100b57",
"eyeColor": "green",
"name": "Baxter Andrews",
"gender": "male",
"company": "example-company-3",
"email": "baxterandrews@example-company-3.com",
"phone": "+1 (911) 448-2944"
}
}
]
}
}
</code></pre></div></div>
<h3 id="match">Match</h3>
<p>We can also use the <code class="language-plaintext highlighter-rouge">match</code> query to repeat our previously used query for students with blue eyes:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"match":{
"eyeColor":"blue"
}
}
}
</code></pre></div></div>
<p>And as you can see our two students with blue eyes are returned:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 5,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b51f6720fb820a80c",
"eyeColor": "blue",
"name": "Mcintosh Cooley",
"gender": "male",
"company": "example-company-1",
"email": "mcintoshcooley@example-company-1.com",
"phone": "+1 (993) 503-3824"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<h3 id="bool-must-and--filter">Bool, Must (AND) & Filter</h3>
<p>If we want to see how many females with blue eyes are in the class we will need to use a <code class="language-plaintext highlighter-rouge">boolean query</code>.
This matches documents based on boolean combinations of other queries. We will be using the <code class="language-plaintext highlighter-rouge">must</code> clause, which means that the enclosed query clauses MUST match for a document to be returned.</p>
<p>To achieve what we did in the previous query and collect all students with blue eyes using a boolean query, we would do the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"eyeColor":"blue"
}
}
]
}
}
}
</code></pre></div></div>
<p>With the same result:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 20,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tK7s-XUBlEyYNvSPxirU",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b51f6720fb820a80c",
"eyeColor": "blue",
"name": "Mcintosh Cooley",
"gender": "male",
"company": "example-company-1",
"email": "mcintoshcooley@example-company-1.com",
"phone": "+1 (993) 503-3824"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<p>Now we can either stack another match clause to get our females with blue eyes, or use the filter clause (which doesn’t affect the scoring of the document).</p>
<p>Two stacked match clauses inside a must work exactly like the logical operator <code class="language-plaintext highlighter-rouge">AND</code>. So in the query below we’re saying <code class="language-plaintext highlighter-rouge">eyeColor = blue AND gender = female</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"eyeColor":"blue"
}
},
{
"match":{
"gender":"female"
}
}
]
}
}
}
</code></pre></div></div>
<p>With the following result (note that the score for this query, 1.4508327, is different from the previous 0.4700036):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 34,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.4508327,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 1.4508327,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<p>And now using the <code class="language-plaintext highlighter-rouge">filter</code> clause:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"bool":{
"must":[
{
"match":{
"eyeColor":"blue"
}
}
],
"filter":[
{
"term":{
"gender":"female"
}
}
]
}
}
}
</code></pre></div></div>
<p>As you can see in the results below, the score is equivalent to just searching for blue eyes: 0.4700036.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 18,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 0.4700036,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
}
]
}
}
</code></pre></div></div>
<h3 id="wildcard">Wildcard</h3>
<p>The next useful query type is the <code class="language-plaintext highlighter-rouge">wildcard query</code>.
Simply put, this allows us to query using the wildcard pattern <code class="language-plaintext highlighter-rouge">*</code>, a placeholder that matches zero or more characters.</p>
<p>The following query searches the phone field for anyone with the 911 area code, matching anything that comes before and after the pattern.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"bool":{
"must":{
"wildcard":{
"phone":"*911*"
}
}
}
}
}
</code></pre></div></div>
<p>As you can see from the result, the relevant student is returned.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "tq7s-XUBlEyYNvSPxirV",
"_score": 1.0,
"_source": {
"id": "5fbce76b07595151f4100b57",
"eyeColor": "green",
"name": "Baxter Andrews",
"gender": "male",
"company": "example-company-3",
"email": "baxterandrews@example-company-3.com",
"phone": "+1 (911) 448-2944"
}
}
]
}
}
</code></pre></div></div>
<h3 id="should-or">Should (OR)</h3>
<p>Finally I wanted to cover <code class="language-plaintext highlighter-rouge">should</code>, which works like the logical operator OR.
Similarly to the stacked <code class="language-plaintext highlighter-rouge">must</code> (AND) clauses earlier, we can stack multiple queries inside a <code class="language-plaintext highlighter-rouge">should</code> clause to create our OR queries.</p>
<p>For this example we will be expanding on our wildcard query and looking for students from one of two area codes: 911 or 929.
Note that we have replaced <code class="language-plaintext highlighter-rouge">must</code> with <code class="language-plaintext highlighter-rouge">should</code> and have both our wildcard queries wrapped up in this clause.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /computer_science/_search
{
"query":{
"bool":{
"should":[
{
"wildcard":{
"phone":"*911*"
}
},
{
"wildcard":{
"phone":"*929*"
}
}
]
}
}
}
</code></pre></div></div>
<p>And below is the result; as you can see, both of our students from these area codes have been returned!</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"took": 48,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "computer_science",
"_type": "students",
"_id": "ta7s-XUBlEyYNvSPxirV",
"_score": 1.0,
"_source": {
"id": "5fbce76b0d94b91072f4a988",
"eyeColor": "blue",
"name": "Carole Decker",
"gender": "female",
"company": "example-company-2",
"email": "caroledecker@example-company-2.com",
"phone": "+1 (929) 421-3982"
}
},
{
"_index": "computer_science",
"_type": "students",
"_id": "tq7s-XUBlEyYNvSPxirV",
"_score": 1.0,
"_source": {
"id": "5fbce76b07595151f4100b57",
"eyeColor": "green",
"name": "Baxter Andrews",
"gender": "male",
"company": "example-company-3",
"email": "baxterandrews@example-company-3.com",
"phone": "+1 (911) 448-2944"
}
}
]
}
}
</code></pre></div></div>
<h2 id="wrap-up">Wrap up</h2>
<p>As you can see, there is a lot to uncover with Elasticsearch; it’s a powerful tool with many uses and something every developer should be at least somewhat familiar with.
This article hasn’t even scratched the surface of what we can do, but I hope it has given you an understanding of what can be accomplished and a solid foundation to continue exploring the world of Elasticsearch.</p>
<p><a href="https://capgemini.github.io/development/elasticsearch-deeper-dive/">Elasticsearch: Deeper Dive</a> was originally published by Capgemini at <a href="https://capgemini.github.io">Capgemini Software Engineering</a> on February 26, 2021.</p>