<h3 id="opentelemetry-observability-in-distributed-services">OpenTelemetry: Observability in Distributed Services</h3>
<p><em>Andrew Bernard (drexler) · Codebrews & Bytes · 2021-04-06</em></p>
<p>In the past year, the platform team was tasked with setting up infrastructure and services to unify the company’s collection of disparate databases into distinct domain databases. These legacy databases served a wide array of business applications and ran on different RDBMSes. The overall objective was to eventually have a set of microservices, each encapsulating a business concern, upon which teams could then build their applications and services. An interesting project. The challenge? It had to be done gradually, without disruption to other teams’ applications. Until the legacy databases were retired, changes to any domain database needed to be streamed to them in real time. Briefly, based on the initial requirements, we went with PostgreSQL to power the domain databases and Debezium to capture data changes in those databases’ Write-Ahead Logs (WAL) and forward them via Kinesis to dedicated services for targeted schema conversions.</p>
<p>After a couple of development iterations, to gain insight into the performance of the service that polled the WAL, the service wrapper around Debezium was instrumented using <a href="https://micrometer.io/">Micrometer</a>, with the generated metrics scraped by an external Prometheus cluster. Subsequently, a Grafana dashboard was set up to visualize them. With the removal of a key requirement, maintaining the order of data changes transmitted downstream, the need for reading the WAL also changed, as each domain database now had a dedicated outbox to which transaction summaries were committed. Naturally, the question became whether Debezium was still necessary. It was not. The team went on to prove that by building a replacement containerized, multithreaded service that polled the <em>Outbox</em> table and sharded the changes by tenant into Kinesis. A couple of interesting things happened here:</p>
<ul>
<li><em>Implementation was in .Net Core</em></li>
<li><em>The instrumentation library changed to <a href="https://www.app-metrics.io/">AppMetrics</a></em></li>
<li><em>The metrics dashboard was updated to visualize the newly generated data</em></li>
</ul>
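<p>The heart of that replacement service is the sharding step: changes are routed by tenant so each tenant’s changes stay together and keep their relative order. A stdlib-only Rust sketch of the idea (the tenant id and shard count are made up for illustration; the actual service handed Kinesis a partition key and let the stream do the routing):</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministically map a tenant id onto one of `shards` buckets.
/// (Kinesis itself hashes the partition key with MD5; this stdlib hash
/// just illustrates the property that matters: the same tenant always
/// lands on the same shard, preserving per-tenant ordering.)
fn shard_for(tenant_id: &str, shards: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    tenant_id.hash(&mut hasher);
    hasher.finish() % shards
}

fn main() {
    let a = shard_for("tenant-42", 8);
    let b = shard_for("tenant-42", 8);
    assert_eq!(a, b); // stable routing for a given tenant
    println!("tenant-42 -> shard {}", a);
}
```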
<p>That’s quite a few changes. The curious Rustacean in me wondered how this service would perform if it were written in <a href="https://www.rust-lang.org/">Rust</a>, taking advantage of the concurrency primitives and performance <a href="https://tokio.rs/">Tokio</a> provides alongside <a href="https://github.com/rusoto/rusoto">Rusoto</a>. Replicating it would be fairly straightforward; however, there was no avoiding the other two changes: a new language-specific metrics library and updating the dashboards all over again. How can all this be avoided, not just in this scenario but in the much wider context of rewriting existing services while preserving their instrumentation? Additionally, internal deliberations about possibly moving away from our APM vendor to an internally hosted Elastic solution were another real factor to consider. Incredibly, a new open source project, <a href="https://opentelemetry.io/">OpenTelemetry</a>, had answers to many of the questions I was mulling over.</p>
<h4 id="opentelemetry">OpenTelemetry</h4>
<p><a href="https://opentelemetry.io/">OpenTelemetry</a> is a <a href="https://www.cncf.io/">CNCF</a> project that defines a language-neutral specification and provides a collection of APIs, SDKs for handling observability data such as logs, metrics & traces in a vendor-agnostic manner. This project was formed from the convergence of two competing projects- OpenTracing & OpenCensus and backed by major cloud providers from Google, Microsoft, Amazon and virtually all vendors in the observability space - Splunk, Elastic, Datadog, LightStep, DynaTrace, NewRelic, Logzio, HoneyComb etc. Let us explore the benefits of adopting OpenTelemetry for existing and future greenfield projects.</p>
<ul>
<li>
<p>The <a href="https://github.com/open-telemetry/opentelemetry-specification">OpenTelemetry Specification</a>’s language neutrality allows for implementations in different languages. Currently, as of this writing, there are implementations for some of the most widely used general purpose languages provided by OpenTelemetry’s SIGs - Special Interest Groups: C++, .Net, Java, Javascript, Python, Go, Rust, Erlang, Ruby, PHP, Swift. These are a dedicated groups of contributors with a focus on a single language implementation. If there’s a software project using a language that is unsupported currently, chances are it will be supported in future. All this means a greater degree of flexibility when implementing software components; regardless of the language choice, instrumentation will be the same.</p>
</li>
<li>
<p><a href="https://opentelemetry.io/">OpenTelemetry</a>’s extensible architecture means that library/plugin authors can instrument their code with the API and when these artifacts are utilized in a service or application implementing the OpenTelemetry SDK, there is visibility into both the service code and third party libraries performance. Microsoft’s <a href="https://dapr.io/">Distributed Application Runtime</a> library is an example. There are plugins for popular frameworks like Spring, Express etc.</p>
</li>
<li>
<p><a href="https://opentelemetry.io/">OpenTelemetry</a> prevents vendor lock-in. The OpenTelemetry <a href="https://opentelemetry.io/docs/collector/">Collector</a> allows for the reception, processing and exportation of telemetry data with support for different open source wire formats - Jaeger, Prometheus, Fluent Bit, W3C TraceContext format, Zipkin’s B3 headers etc. More so, with implementation of <a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter">Exporters</a> for different telemetry backends, switching between vendors is a breeze. For example, one can pipe their tracing data to NewRelic, Elastic, a deployed instance of Zipkin etc…and it is all a simple configuration change on the <a href="https://opentelemetry.io/docs/collector/">Collector</a>. Think of it as instrumentation as a form of abstraction, where the destination backends for the telemetry data is abstracted away from the application/service.</p>
<p><img src="/assets/imgs/otel-collector.png" alt="OpenTelemetry Collector" /></p>
</li>
<li>
<p>With the stabilization of the <a href="https://medium.com/opentelemetry/opentelemetry-specification-v1-0-0-tracing-edition-72dd08936978">Tracing Specification</a> and the outline of the <a href="https://medium.com/opentelemetry/opentelemetry-metrics-roadmap-f4276fd070cf">Metrics Roadmap</a>, OpenTelemetry is shaping up to be the way to derive insights into distributed services: one library to trace, generate metrics and connect them to other telemetry data. Since it is also the <a href="https://www.cncf.io/">CNCF</a> project replacing OpenTracing & OpenCensus, for service meshes like <a href="https://linkerd.io/">Linkerd</a>, <a href="https://opentelemetry.io/">OpenTelemetry</a> will eventually become the de facto way of propagating telemetry data from the various services. This means an easier transition when moving from a collection of microservices to a service mesh once the complexity warrants it.</p>
</li>
</ul>
<h4 id="demo">Demo</h4>
<p><img src="/assets/imgs/otel-demo-tracing.png" alt="OpenTelemetry Collector" /></p>
<p>For a quick demonstration of the tracing capabilities, I have a <a href="https://github.com/drexler/opentelemetry-demo-tracing">demo</a> built to showcase:</p>
<ul>
<li>support for multiple languages - Rust, Typescript with ExpressJS, and .Net 5</li>
<li>tracing across different communication protocols - HTTP & gRPC</li>
<li>auto-instrumentation of services</li>
<li>manual instrumentation of services</li>
</ul>
<p>Rust is intentionally used for two services, <em>Employee</em> & <em>Direct Deposits</em>, to demonstrate manual instrumentation with both synchronous and asynchronous functions: the data layer each service works with offers a sync API in the case of <a href="http://diesel.rs/">Diesel</a> with PostgreSQL and an async API via <a href="https://www.mongodb.com/2">MongoDB</a>’s Rust 2.0-alpha driver. Typescript comes along for the ride since it’s one of the languages I use server-side. .Net 5 is the latest language iteration from Microsoft, so I had a side interest in taking a peek at its performance relative to Rust <a href="https://drexler.github.io/aws-lambda-rust/">again</a>.</p>
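<p>The sync-vs-async split boils down to one pattern: a blocking (Diesel-style) call is offloaded so the caller’s thread stays free. A stdlib-only Rust sketch of that hand-off (the query function is a made-up stand-in; in the demo itself, Tokio’s runtime handles this via its blocking-task facilities):</p>

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for a blocking, Diesel-style query (hypothetical data).
fn load_employees_blocking() -> Vec<String> {
    vec!["ada".into(), "grace".into()]
}

fn main() {
    // Offload the blocking call to a worker thread; the caller stays
    // free until it chooses to wait on the channel. Async runtimes
    // wrap this same idea in a Future instead of a channel receive.
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || tx.send(load_employees_blocking()).unwrap());
    let employees = rx.recv().unwrap();
    assert_eq!(employees.len(), 2);
    println!("loaded {} employees", employees.len());
}
```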
<h5 id="instrumentation-traces-events--tags">Instrumentation, Traces, Events & Tags</h5>
<p>With OpenTelemetry, one can auto-instrument code and/or apply manual instrumentation. This gives flexibility when working with legacy codebases or starting greenfield projects: teams can auto-instrument applications first and later apply manual instrumentation to the areas of the code that warrant deeper insight.</p>
<p>From the demo code, I instrumented the .Net5-based <em>Paycheck</em> service as follows:</p>
<figure class="highlight"><pre><code class="language-csharp" data-lang="csharp"><span class="k">public</span> <span class="k">static</span> <span class="k">void</span> <span class="nf">ConfigureTracing</span><span class="p">(</span><span class="n">IServiceCollection</span> <span class="n">services</span><span class="p">)</span>
<span class="p">{</span>
<span class="c1">// Necessary for OpenTelemetry Collector communication since traffic is unencrypted for demo purposes</span>
<span class="n">AppContext</span><span class="p">.</span><span class="nf">SetSwitch</span><span class="p">(</span><span class="s">"System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport"</span><span class="p">,</span> <span class="k">true</span><span class="p">);</span>
<span class="n">services</span><span class="p">.</span><span class="nf">AddOpenTelemetryTracing</span><span class="p">((</span><span class="n">builder</span><span class="p">)</span> <span class="p">=></span> <span class="n">builder</span>
<span class="p">.</span><span class="nf">AddSource</span><span class="p">(</span><span class="s">"paycheck-db-conn"</span><span class="p">)</span>
<span class="p">.</span><span class="nf">AddAspNetCoreInstrumentation</span><span class="p">()</span>
<span class="p">.</span><span class="nf">SetResourceBuilder</span><span class="p">(</span><span class="n">ResourceBuilder</span><span class="p">.</span><span class="nf">CreateDefault</span><span class="p">().</span><span class="nf">AddService</span><span class="p">(</span><span class="s">"paycheck-service"</span><span class="p">))</span>
<span class="p">.</span><span class="nf">AddOtlpExporter</span><span class="p">(</span><span class="n">options</span> <span class="p">=></span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">otelCollectorUri</span> <span class="p">=</span> <span class="n">Environment</span><span class="p">.</span><span class="nf">GetEnvironmentVariable</span><span class="p">(</span><span class="s">"OTEL_COLLECTOR_URI"</span><span class="p">)</span> <span class="p">??</span> <span class="s">"http://localhost:4317"</span><span class="p">;</span>
<span class="n">options</span><span class="p">.</span><span class="n">ExportProcessorType</span> <span class="p">=</span> <span class="n">ExportProcessorType</span><span class="p">.</span><span class="n">Batch</span><span class="p">;</span>
<span class="n">options</span><span class="p">.</span><span class="n">Endpoint</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">Uri</span><span class="p">(</span><span class="n">otelCollectorUri</span><span class="p">);</span>
<span class="p">}));</span>
<span class="p">}</span></code></pre></figure>
<p>Here, under the hood, OpenTelemetry for .Net sets up the auto-instrumentation via <em>AddAspNetCoreInstrumentation()</em> and uses <em>System.Diagnostics.ActivitySource</em> to set up a custom activity source named <em>paycheck-db-conn</em> that handles the manual instrumentation, as seen below:</p>
<figure class="highlight"><pre><code class="language-csharp" data-lang="csharp"><span class="k">public</span> <span class="k">class</span> <span class="nc">PayRepository</span> <span class="p">:</span> <span class="n">BaseRepository</span><span class="p"><</span><span class="n">Pay</span><span class="p">>,</span> <span class="n">IPayRepository</span>
<span class="p">{</span>
<span class="k">private</span> <span class="k">readonly</span> <span class="n">ActivitySource</span> <span class="n">_activitySource</span><span class="p">;</span>
<span class="k">public</span> <span class="nf">PayRepository</span><span class="p">(</span>
<span class="n">IMongoClient</span> <span class="n">mongoClient</span><span class="p">,</span>
<span class="n">IClientSessionHandle</span> <span class="n">clientSessionHandle</span><span class="p">)</span> <span class="p">:</span> <span class="k">base</span><span class="p">(</span><span class="n">mongoClient</span><span class="p">,</span> <span class="n">clientSessionHandle</span><span class="p">,</span> <span class="s">"paychecks"</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">_activitySource</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">ActivitySource</span><span class="p">(</span><span class="s">"paycheck-db-conn"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">public</span> <span class="k">async</span> <span class="n">Task</span><span class="p"><</span><span class="n">IEnumerable</span><span class="p"><</span><span class="n">Pay</span><span class="p">>></span> <span class="nf">GetAllPaychecksAsync</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">var</span> <span class="n">span</span> <span class="p">=</span> <span class="n">_activitySource</span><span class="p">.</span><span class="nf">StartActivity</span><span class="p">(</span><span class="s">"GetAllPaychecksAsync"</span><span class="p">);</span>
<span class="n">List</span><span class="p"><</span><span class="n">Pay</span><span class="p">></span> <span class="n">result</span> <span class="p">=</span> <span class="k">null</span><span class="p">;</span>
<span class="k">try</span>
<span class="p">{</span>
<span class="n">result</span> <span class="p">=</span> <span class="k">await</span> <span class="n">Collection</span><span class="p">.</span><span class="nf">AsQueryable</span><span class="p">().</span><span class="nf">ToListAsync</span><span class="p">();</span>
<span class="n">span</span><span class="p">.</span><span class="nf">AddTag</span><span class="p">(</span><span class="s">"paychecks.count"</span><span class="p">,</span> <span class="n">result</span><span class="p">.</span><span class="n">Count</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">catch</span> <span class="p">(</span><span class="n">Exception</span> <span class="n">ex</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">span</span><span class="p">.</span><span class="nf">AddEvent</span><span class="p">(</span><span class="k">new</span> <span class="nf">ActivityEvent</span><span class="p">(</span><span class="s">$"Call Failure. Reason: </span><span class="p">{</span><span class="n">ex</span><span class="p">.</span><span class="n">Message</span><span class="p">}</span><span class="s">"</span><span class="p">));</span>
<span class="k">throw</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">finally</span>
<span class="p">{</span>
<span class="n">span</span><span class="p">.</span><span class="nf">Stop</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">result</span><span class="p">;</span>
<span class="p">}</span></code></pre></figure>
<p>A trace is simply a collection of spans. In the above, we start a <em>child</em> span of the parent span and annotate it appropriately via tags & events. Events represent occurrences at a specific time during a span’s workload. Together, this additional metadata drives quick insights when investigating problems. For example, suppose we got the following API response on an attempt to load all paychecks:</p>
<figure class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span><span class="w">
</span><span class="nl">"statusCode"</span><span class="p">:</span><span class="w"> </span><span class="mi">500</span><span class="p">,</span><span class="w">
</span><span class="nl">"message"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Server ist kaput!"</span><span class="p">,</span><span class="w">
</span><span class="nl">"developerMessage"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Internal Error"</span><span class="p">,</span><span class="w">
</span><span class="nl">"requestId"</span><span class="p">:</span><span class="w"> </span><span class="s2">"451b8025562676951540a00cc121af04"</span><span class="w">
</span><span class="p">}</span></code></pre></figure>
<p>With the requestId (aka the global traceId) above, we can see how the metadata applied to the span helps us understand a request error as the request propagates across different service and network boundaries.</p>
<p><img src="/assets/imgs/trace-error.png" alt="Trace Overview" /></p>
<p><img src="/assets/imgs/db-down.png" alt="Database Trace Details" /></p>
<p>In this case, a database connectivity problem!</p>
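<p>Concretely, the trace views above are assembled from spans that share a single trace id, with parent links forming the tree. A minimal, illustrative model of what each exported span carries (real OpenTelemetry SDKs define far richer types; the names and values here are made up to mirror the demo):</p>

```rust
/// Illustrative span model: every span in one trace shares `trace_id`,
/// and `parent_span_id` links children to their parents.
struct Span {
    trace_id: u128,
    span_id: u64,
    parent_span_id: Option<u64>, // None for the root span
    name: String,
    tags: Vec<(String, String)>, // key/value annotations, e.g. paychecks.count
    events: Vec<String>,         // timestamped occurrences, e.g. call failures
}

fn main() {
    let root = Span {
        trace_id: 0x451b_8025, span_id: 1, parent_span_id: None,
        name: "GET /paychecks".into(), tags: vec![], events: vec![],
    };
    let child = Span {
        trace_id: root.trace_id, span_id: 2, parent_span_id: Some(root.span_id),
        name: "GetAllPaychecksAsync".into(),
        tags: vec![("paychecks.count".into(), "12".into())],
        events: vec!["Call Failure. Reason: timeout".into()],
    };
    // A trace is simply the collection of spans with the same trace_id.
    assert_eq!(root.trace_id, child.trace_id);
    assert_eq!(child.parent_span_id, Some(root.span_id));
    println!("{} -> {} [{}]", root.name, child.name, child.tags[0].0);
}
```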
<p>We apply a similar pattern to both Rust-based services, <em>Employee</em> & <em>Direct Deposits</em>. Below, we pull the propagated trace context and use it to build child spans around database calls.</p>
<figure class="highlight"><pre><code class="language-rust" data-lang="rust"><span class="nd">#[tonic::async_trait]</span>
<span class="k">impl</span> <span class="n">EmployeeService</span> <span class="k">for</span> <span class="n">MyEmployeeService</span> <span class="p">{</span>
<span class="k">async</span> <span class="k">fn</span> <span class="nf">get_all_employees</span><span class="p">(</span>
<span class="o">&</span><span class="k">self</span><span class="p">,</span>
<span class="n">request</span><span class="p">:</span> <span class="n">Request</span><span class="o"><</span><span class="p">()</span><span class="o">></span><span class="p">,</span>
<span class="p">)</span> <span class="k">-></span> <span class="n">Result</span><span class="o"><</span><span class="n">Response</span><span class="o"><</span><span class="n">GetAllEmployeesResponse</span><span class="o">></span><span class="p">,</span> <span class="n">Status</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">parent_ctx</span> <span class="o">=</span> <span class="nn">tracing</span><span class="p">::</span><span class="nf">get_parent_context</span><span class="p">(</span><span class="o">&</span><span class="n">request</span><span class="p">);</span>
<span class="k">let</span> <span class="n">tracer</span> <span class="o">=</span> <span class="nn">global</span><span class="p">::</span><span class="nf">tracer</span><span class="p">(</span><span class="s">"employee-service"</span><span class="p">);</span>
<span class="k">let</span> <span class="n">span</span> <span class="o">=</span> <span class="n">tracer</span><span class="nf">.start_with_context</span><span class="p">(</span><span class="s">"get_all_employees"</span><span class="p">,</span> <span class="n">parent_ctx</span><span class="p">);</span>
<span class="k">let</span> <span class="n">db_result</span> <span class="o">=</span> <span class="n">tracer</span><span class="nf">.with_span</span><span class="p">(</span><span class="n">span</span><span class="p">,</span> <span class="p">|</span><span class="mi">_</span><span class="n">cx</span><span class="p">|</span> <span class="k">-></span> <span class="n">Result</span><span class="o"><</span><span class="nb">Vec</span><span class="o"><</span><span class="n">Employee</span><span class="o">></span><span class="p">,</span> <span class="nn">error</span><span class="p">::</span><span class="n">Error</span><span class="o">></span> <span class="p">{</span>
<span class="k">let</span> <span class="n">db_client</span> <span class="o">=</span> <span class="nn">EmployeeDb</span><span class="p">::</span><span class="nf">initialize</span><span class="p">()</span><span class="o">?</span><span class="p">;</span>
<span class="n">db_client</span>
<span class="nf">.get_employees</span><span class="p">()</span>
<span class="nf">.map</span><span class="p">(|</span><span class="n">employees</span><span class="p">|</span> <span class="n">employees</span><span class="nf">.into_iter</span><span class="p">()</span><span class="nf">.map</span><span class="p">(</span><span class="n">model_mapper</span><span class="p">)</span><span class="nf">.collect</span><span class="p">())</span>
<span class="p">});</span>
<span class="k">match</span> <span class="n">db_result</span> <span class="p">{</span>
<span class="nf">Ok</span><span class="p">(</span><span class="n">employees</span><span class="p">)</span> <span class="k">=></span> <span class="nf">Ok</span><span class="p">(</span><span class="nn">Response</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">GetAllEmployeesResponse</span> <span class="p">{</span> <span class="n">employees</span> <span class="p">})),</span>
<span class="nf">Err</span><span class="p">(</span><span class="mi">_</span><span class="p">)</span> <span class="k">=></span> <span class="nf">Err</span><span class="p">(</span><span class="nn">Status</span><span class="p">::</span><span class="nf">unknown</span><span class="p">(</span><span class="s">"unable to load all employees"</span><span class="p">)),</span>
<span class="p">}</span>
<span class="p">}</span></code></pre></figure>
<p>Similarly, for the <em>Payroll</em> service:</p>
<figure class="highlight"><pre><code class="language-typescript" data-lang="typescript"><span class="cm">/**
* Gets an employee's paychecks
*/</span>
<span class="nx">employeesRouter</span><span class="p">.</span><span class="kd">get</span><span class="p">(</span><span class="dl">'</span><span class="s1">/:employee_id/paychecks</span><span class="dl">'</span><span class="p">,</span> <span class="p">(</span><span class="nx">request</span><span class="p">:</span> <span class="nx">Request</span><span class="p">,</span> <span class="nx">response</span><span class="p">:</span> <span class="nx">Response</span><span class="p">,</span> <span class="nx">next</span><span class="p">:</span> <span class="nx">NextFunction</span><span class="p">)</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">span</span> <span class="o">=</span> <span class="nx">tracer</span><span class="p">.</span><span class="nx">startSpan</span><span class="p">(</span><span class="dl">'</span><span class="s1">payroll: getEmployeePaychecks</span><span class="dl">'</span><span class="p">);</span>
<span class="kd">const</span> <span class="nx">employeeId</span> <span class="o">=</span> <span class="nx">request</span><span class="p">.</span><span class="nx">params</span><span class="p">.</span><span class="nx">employee_id</span><span class="p">;</span>
<span class="nx">api</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="kd">with</span><span class="p">(</span><span class="nx">api</span><span class="p">.</span><span class="nx">setSpan</span><span class="p">(</span><span class="nx">api</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="nx">active</span><span class="p">(),</span> <span class="nx">span</span><span class="p">),</span> <span class="k">async</span><span class="p">()</span> <span class="o">=></span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">traceId</span> <span class="o">=</span> <span class="nx">span</span><span class="p">.</span><span class="nx">context</span><span class="p">().</span><span class="nx">traceId</span><span class="p">;</span>
<span class="k">try</span> <span class="p">{</span>
<span class="kd">const</span> <span class="nx">results</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">paycheckService</span><span class="p">.</span><span class="nx">getEmployeePaychecks</span><span class="p">({</span><span class="na">employee_id</span><span class="p">:</span> <span class="nx">employeeId</span><span class="p">});</span>
<span class="nx">response</span><span class="p">.</span><span class="nx">send</span><span class="p">(</span><span class="nx">formatResponse</span><span class="p">(</span><span class="nx">results</span><span class="p">.</span><span class="nx">paychecks</span><span class="p">));</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">(</span><span class="nx">err</span><span class="p">)</span> <span class="p">{</span>
<span class="nx">next</span><span class="p">(</span><span class="nx">createError</span><span class="p">(...[</span><span class="nx">convertGrpcToHttpErrorCode</span><span class="p">(</span><span class="nx">err</span><span class="p">)],</span> <span class="p">{</span>
<span class="na">developerMessage</span><span class="p">:</span> <span class="nx">getGrpcErrorMessage</span><span class="p">(</span><span class="nx">err</span><span class="p">.</span><span class="nx">message</span><span class="p">),</span>
<span class="nx">traceId</span>
<span class="p">}));</span>
<span class="p">}</span> <span class="k">finally</span> <span class="p">{</span>
<span class="nx">span</span><span class="p">.</span><span class="nx">end</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">});</span>
<span class="p">});</span></code></pre></figure>
<h5 id="collector-deployment-pipelines--telemetry-backend-configuration">Collector Deployment, Pipelines & Telemetry-backend Configuration</h5>
<p>There are two modes of deploying the OpenTelemetry Collector:</p>
<ul>
<li>agent: where the collector instance runs on the same host as the client application</li>
<li>gateway: where one or more collector instances run as a standalone service.</li>
</ul>
<p>The Collector serves as a centralized point for configuring the reception, processing and exportation of telemetry data to the desired telemetry backends. In our demo configuration, shown below, trace data <em>received</em> via HTTP & gRPC is pushed in <em>batches</em> through a trace <em>pipeline</em> set up to <em>export</em> the data to the <a href="https://www.jaegertracing.io/">Jaeger</a> and <a href="https://zipkin.io/">Zipkin</a> instances we deployed for analysis. Although commented out, the traces could have been sent to New Relic as well. The Collector supports many vendors, which makes it easy to compare the depth of analytics that various vendors can provide given the same telemetry data.</p>
<figure class="highlight"><pre><code class="language-yaml" data-lang="yaml"><span class="na">receivers</span><span class="pi">:</span>
<span class="na">otlp</span><span class="pi">:</span>
<span class="na">protocols</span><span class="pi">:</span>
<span class="na">grpc</span><span class="pi">:</span>
<span class="na">http</span><span class="pi">:</span>
<span class="na">cors_allowed_origins</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">http://*</span>
<span class="pi">-</span> <span class="s">https://*</span>
<span class="na">exporters</span><span class="pi">:</span>
<span class="na">jaeger</span><span class="pi">:</span>
<span class="na">endpoint</span><span class="pi">:</span> <span class="s">jaeger-all-in-one:14250</span>
<span class="na">insecure</span><span class="pi">:</span> <span class="no">true</span>
<span class="na">zipkin</span><span class="pi">:</span>
<span class="na">endpoint</span><span class="pi">:</span> <span class="s2">"</span><span class="s">http://zipkin-all-in-one:9411/api/v2/spans"</span>
<span class="c1"># newrelic:</span>
<span class="c1"># apikey: <<NEW_RELIC_INSIGHTS_KEY>></span>
<span class="c1"># timeout: 30s</span>
<span class="na">processors</span><span class="pi">:</span>
<span class="na">batch</span><span class="pi">:</span>
<span class="na">service</span><span class="pi">:</span>
<span class="na">pipelines</span><span class="pi">:</span>
<span class="na">traces</span><span class="pi">:</span>
<span class="na">receivers</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">otlp</span><span class="pi">]</span>
<span class="na">processors</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">batch</span><span class="pi">]</span>
<span class="na">exporters</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">jaeger</span><span class="pi">,</span> <span class="nv">zipkin</span><span class="pi">]</span>
<span class="c1">#exporters: [zipkin, jaeger, newrelic]</span></code></pre></figure>
<p>Through pipelines, metrics and eventually log data will also be exported through the Collector, making OpenTelemetry the one library for an application’s observability needs. Additionally, with Collector pipelines one can ship different telemetry data to different vendors: if desired, ship metrics to a vendor like Splunk while still using Datadog for traces and logging. OpenTelemetry opens up the possibilities…</p>
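<p>Conceptually, a trace pipeline is just a fan-out: received spans are buffered by the batch processor and, on flush, handed to every configured exporter. A toy Rust model of that flow (illustrative only - the real Collector is configured via YAML as above, not coded, and its exporters speak the actual backend wire formats):</p>

```rust
/// Anything that can receive a finished batch of spans.
trait Exporter {
    fn export(&mut self, batch: &[String]);
}

/// Toy exporter standing in for a Jaeger/Zipkin sink.
struct Recorder { name: &'static str, seen: usize }
impl Exporter for Recorder {
    fn export(&mut self, batch: &[String]) {
        self.seen += batch.len();
        println!("{}: exported {} spans", self.name, batch.len());
    }
}

/// receivers -> batch processor -> exporters, in miniature.
struct Pipeline {
    buffer: Vec<String>,
    batch_size: usize,
    exporters: Vec<Box<dyn Exporter>>,
}

impl Pipeline {
    fn receive(&mut self, span: String) {
        self.buffer.push(span);
        if self.buffer.len() >= self.batch_size {
            self.flush();
        }
    }
    fn flush(&mut self) {
        // Fan the same batch out to every configured exporter.
        for e in &mut self.exporters {
            e.export(&self.buffer);
        }
        self.buffer.clear();
    }
}

fn main() {
    let mut p = Pipeline {
        buffer: vec![],
        batch_size: 2,
        exporters: vec![
            Box::new(Recorder { name: "jaeger", seen: 0 }),
            Box::new(Recorder { name: "zipkin", seen: 0 }),
        ],
    };
    p.receive("span-a".into());
    p.receive("span-b".into()); // batch full: both exporters get 2 spans
}
```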
<h5 id="links">Links</h5>
<p>The demo code for the above can be found <a href="https://github.com/drexler/opentelemetry-demo-tracing">here</a></p>dReXlerIn the past year, the platform team was tasked with setting up infrastructure and services to unify the company’s collection of disparate databases into distinct domain databases. These legacy databases served a wide array of business applications and ran on different RDBMSes. The overall objective was to eventually have a set of microservices each of which encapsulated a business concern, upon which teams could then build their applications and services. An interesting project. The challenge? It had to be done gradually without disruption to other teams’ applications. Until those legacy databases were retired, changes to any domain database needed to be streamed in realtime to them. Briefly, based on the initial requirements, we went with PostgreSQL to power the domain databases; Debezium to capture data changes in those databases’ Write Ahead Logs and forward them for targeted schema conversions with dedicated services via Kinesis. After a couple of development iterations, to gain an insight into the performance of the service that polled the WAL, the service wrapper around Debezium was instrumented using Micrometer with the generated metrics scraped by an external Prometheus cluster. Subsequently, a Grafana dashboard was setup to visualize these. With the removal of a key requirement of maintaining the order of data changes transmitted downstream, the need for reading the WAL also changed as domain databases now each had a dedicated outbox to commit transaction summaries to. Naturally, the question became whether Debezium would still be necessary. It was not. The team would go on to prove that out by building a replacement containerized multithreaded service that polled the Outbox table and sharded the changes by tenant into Kinesis. 
Git reflog to the rescue!2020-07-29T16:08:00+00:002020-07-29T16:08:00+00:00/reflog-rescue<p>The standard development workflow utilized by my team requires rebasing feature branches on the master branch prior to issuing a pull request and the subsequent merge. This allows us to ensure that commits are always <em>in order</em> after a merge, rather than arranged however a merge strategy leaves them. There are pros and cons to each approach; however, having been a heavy Git user for years, I tend to favor rebasing. So one can imagine my panic when, for a hairy ten minutes, I lost my work entirely on a new feature branch. Pull from remote? That was useless since I had previously rebased and force-pushed the local branch up to the remote branch, thus wiping everything there.</p>
<p>Here’s how I got into such a state. The repository was new and had originally been created without any files, so on pushing up my feature branch for the initial review, it became the default branch. To correct this, the idea was to create a <em>master</em> branch from the feature branch and delete all its commits, leaving only the README file it contained. Thereafter, the feature branch would be rebased on master as always and the PR issued against it. What happened on the rebase was that Git, seeing that all the commits had been deleted on master, essentially resolved the rebase automatically by deleting the same commits from my feature branch. Rebase re-writes history! Since they shared the same commit hashes, I was left with nothing. Immediately force-pushing to remote after the rebase compounded the issue.</p>
<p>With <code class="highlighter-rouge">git log -2 --oneline</code> returning nothing, how was I to recover days of work? Git reflog! I knew about it but had never had occasion to use it. For anyone reading this: invest your time in mastering advanced Git commands, and especially that one. It’s a life(hair?) saver! Below were my commands:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git reflog
$ git reset HEAD@{number}  # the reflog entry from just before the rebase
$ git status
$ git reset --hard
</code></pre></div></div>
<p>To summarize the above, I needed to get my local branch back to the state it was in before the rebase; the first two commands did that.
Next, to be sure I had the correct files staged - <code class="highlighter-rouge">git status</code>. Seeing all the deletions listed there, the obvious thing was to discard those changes, thus the <code class="highlighter-rouge">git reset --hard</code>.</p>
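The whole rescue can be rehearsed safely in a throwaway repository before it is ever needed in anger. The repository, file, and commit names below are purely illustrative:

```shell
set -e
# Throwaway repository with two commits
git init -q reflog-demo
git -C reflog-demo config user.email demo@example.com
git -C reflog-demo config user.name demo
echo one > reflog-demo/work.txt
git -C reflog-demo add work.txt
git -C reflog-demo commit -qm "first"
echo two >> reflog-demo/work.txt
git -C reflog-demo add work.txt
git -C reflog-demo commit -qm "second"

# Simulate the disaster: hard-reset away the latest commit
git -C reflog-demo reset -q --hard HEAD~1

# The reflog still records where HEAD was before the reset,
# so HEAD@{1} (the entry just before the reset) brings it back
git -C reflog-demo reflog | head -n 3
git -C reflog-demo reset -q --hard HEAD@{1}
git -C reflog-demo log --oneline -1
```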
<p>With work now restored, it was easy to push it back to remote and end on a good note.</p>
AWS Lambda & Rust2020-04-03T16:08:00+00:002020-04-03T16:08:00+00:00/aws-lambda-rust<h4 id="background">Background</h4>
<p>
The HR product my previous team inherited and migrated from Azure to AWS was built with ASP.Net in VB.Net. As one can imagine, this legacy application, although still useful, is woefully inadequate now that
modern alternatives such as single-page applications offer a smoother user experience. To modernize it, distinct functional parts of the application were to be
re-written in ReactJS, with the resulting bundle served out of a Cloudfront-distributed S3 bucket on page loads. At Asure, all new development is <i>Cloud</i>-first. The
earliest module re-written this way was <i>Direct Deposits</i>, whose backend was a series of lambdas utilizing node-mssql to interact with an RDS datastore.
</p>
<p>
Each tenant had a series of stored encrypted credentials that needed to be decoded on the fly for further calls into the other internal applications that
linked the HR application to the Payroll suite. To replicate this, the original VB.Net code was ported into a utility lambda in .Net Core (2.1)
which other <i>Direct Deposit</i>-related lambdas could call into. Ideally, making this a lambda layer would have been nice, but with the different runtimes
involved - Node & .Net Core - that was ruled out.
</p>
<h4 id="problem">Problem</h4>
<p>With production workloads, each lambda needing to create/update/delete an existing direct deposit had to await the result of the call to decrypt the necessary credentials.
The associated cold start of the .Net Core-based decryptor lambda became a bottleneck for the other lambdas and had a noticeable impact on the user experience overall. Here’s
an image of the latency involved:</p>
<p><img src="/assets/imgs/decrypt-lambda-unoptimized.png" alt="decrypt-lambda-unoptimized" /></p>
<h4 id="solution">Solution</h4>
<p>There were a few possible solutions:</p>
<ul>
<li>Find supporting Node libraries and fold the existing decryption functionality into the various services.</li>
<li>Rewrite the decryption lambda in a different runtime.</li>
</ul>
<p>The first option looked promising; however, the nuances between this specific implementation and Node’s were a bit troublesome, and the risk of breaking the existing services was also a factor. The second was limited in scope, and with the alternatives available - Go & Rust - there was an opportunity to investigate how these languages could be leveraged to meet the performance constraints we sought, as well as expand the <em>tools</em> available to the team for performance-related problems. Since Rust is not garbage-collected and offers near native-C performance, it won out. Admittedly, I am biased when it comes to Rust.</p>
<p>Utilizing <a href="https://github.com/rusoto/rusoto">Rusoto</a> and the <a href="https://github.com/softprops/serverless-rust">Serverless-Rust</a> plugin, I canaried the Rust-equivalent version of the decryptor service. This was a non-optimized version with
the following traits:</p>
<ul>
<li>Non-architecture specific build</li>
<li>Skipped prewarming of the service to investigate cold-start effects</li>
<li>Used synchronous IO-blocking version of <a href="https://github.com/rusoto/rusoto">Rusoto</a>, version <a href="https://docs.rs/crate/rusoto_core/0.42.0">0.42</a>.</li>
<li>OpenSSL instead of rustls</li>
</ul>
<p>The result: <strong>7X improvement on cold starts!</strong></p>
<p><img src="/assets/imgs/rust-decrypt-lambda-unoptimized.png" alt="rust-decrypt-lambda" /></p>
<h4 id="what-about-net-core-31">What about .Net Core 3.1?</h4>
<p>So a few days back, AWS started providing support for .Net Core 3.1. Would upgrading to that help with the overall cold start of the decrypt service? I did rummage around with that, but although there’s a noticeable improvement in the overall cold starts of the .Net Core 3.1-based lambda, the unoptimized Rust version still pips it at the post. Here’s a summary of my findings. Note, this is not a perfect benchmark but rather a focused use-case analysis.</p>
<table>
<thead>
<tr>
<th>Lambda Runtime</th>
<th>Container Type</th>
<th>Avg Cold Start Time(ms)</th>
<th>Avg Duration(ms)</th>
<th>Avg Memory/Invocation(MB)</th>
</tr>
</thead>
<tbody>
<tr>
<td>.Net Core 2.1</td>
<td>AmazonLinux 2</td>
<td>4819</td>
<td>115</td>
<td>94</td>
</tr>
<tr>
<td>Custom (Rust)</td>
<td>AmazonLinux</td>
<td>283</td>
<td>232</td>
<td>39</td>
</tr>
<tr>
<td>.Net Core 3.1</td>
<td>AmazonLinux 2</td>
<td>3549</td>
<td>94</td>
<td>111</td>
</tr>
<tr>
<td>Custom (Rust**)</td>
<td>AmazonLinux</td>
<td>203</td>
<td>76</td>
<td>34</td>
</tr>
</tbody>
</table>
<p>Rust** : <em>Partially optimized (rustls for the Rusoto SDK, targeted architecture: x86_64-unknown-linux-musl)</em></p>
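For the curious, that partial optimization amounts to two small build tweaks. The fragments below are an illustrative sketch rather than the team's actual project files - the crate name, version pin, and feature name are assumptions based on Rusoto's published options:

```shell
# Pin builds to the musl target so the lambda binary is statically linked
# and runs cleanly on the AmazonLinux-based custom runtime:
mkdir -p .cargo
cat > .cargo/config <<'EOF'
[build]
target = "x86_64-unknown-linux-musl"
EOF

# Swap Rusoto's OpenSSL-backed TLS for rustls via cargo features
# (hypothetical Cargo.toml fragment for a 0.42-era Rusoto dependency):
cat >> Cargo.toml <<'EOF'
[dependencies.rusoto_core]
version = "0.42"
default-features = false
features = ["rustls"]
EOF

# One-time target install, then the release build:
#   rustup target add x86_64-unknown-linux-musl
#   cargo build --release --target x86_64-unknown-linux-musl
```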
Moving from Blogger to Github Pages2020-03-13T16:08:00+00:002020-03-13T16:08:00+00:00/new-decade-new-platformSo it's been two years since my last update and I've generally been too preoccupied(lazy?) to update this blog. I started on Blogger back in 2009 to showcase
a side project employing Java and good ol' Adobe Air when using Flash/Flex was all the rage. Sidenote: does anyone use GWT these days? Over time, as I've invested my efforts
mostly on Github projects - both personal and open source-related - it became apparent that consolidating both my blog and other projects on a common platform
was the way forward. I wanted a minimalist-styled blog without too much effort; building one in React was under consideration since it's a library I work with.
My inner geek still wanted the barebones and thus Jekyll & Github Pages won out. The migration effort was fairly straightforward.
<br/>
<p>
Adding a new blog post simply involves the following:
<ul>
<li> adding an entry to the _posts folder</li>
<li> serving up the files to test locally: <b>$ bundle exec jekyll serve --trace</b></li>
<li> testing the rendered html for correctness: <b>$ bundle exec htmlproofer ./_site --disable-external</b></li>
<li> committing & pushing like any other code</li>
</ul>
</p>
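For reference, the first bullet looks roughly like this - the slug and front matter below are just an illustrative sketch of Jekyll's <b>_posts/YYYY-MM-DD-slug.md</b> convention:

```shell
# New posts live in _posts/ and must follow the date-prefixed naming scheme
mkdir -p _posts
cat > _posts/2020-03-13-new-decade-new-platform.md <<'EOF'
---
layout: post
title: "Moving from Blogger to Github Pages"
date: 2020-03-13 16:08:00 +0000
---
Post body goes here.
EOF

# Then the local test loop from the list above:
#   bundle exec jekyll serve --trace
#   bundle exec htmlproofer ./_site --disable-external
```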
<br/>
EASY PEASY LEMON SQUEEZY!Install Chrome under WSL2018-05-24T16:08:00+00:002018-05-24T16:08:00+00:00/install-chrome-under-wslEven though WSL on Windows 10 is great at giving engineers a 'Nix-like environment on a Windows machine to develop, build and test, I ran into an issue where my Karma tests were failing due to the inability to find a suitable browser to execute the test suite on. I found some help online with installing Chrome directly under the WSL. Below are the steps I used:<br /><br /><script src="https://gist.github.com/drexler/d70ab957f964dbef1153d46bd853c775.js"></script>
Debugging on WSL: Breakpoint ignored because generated code not found(source map problem?)2018-04-27T00:41:00+00:002018-04-27T00:41:00+00:00/debugging-on-wsl-breakpoint-ignored<div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-dT1oOgAMqfU/WuJqlOsU6pI/AAAAAAAACNM/VkQ8PBQOoZsM4Ee5YYo2b7W8Je164svTgCEwYBhgL/s1600/untriggered-breakpoint.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="571" data-original-width="1600" height="228" src="https://4.bp.blogspot.com/-dT1oOgAMqfU/WuJqlOsU6pI/AAAAAAAACNM/VkQ8PBQOoZsM4Ee5YYo2b7W8Je164svTgCEwYBhgL/s640/untriggered-breakpoint.PNG" width="640" alt="Untriggered breakpoint in VSCode" /></a></div><br />VSCode's support for debugging Typescript-based projects is really second to none in my opinion, so much so that I made the switch from Atom. Atom, though, is still my goto editor for all my Terraform & other configuration-related projects. Back to the topic at hand: debugging a React app on my Windows 10 machine with WSL enabled, my breakpoints were not triggered due to 'missing' source maps, even though the <i>tsconfig.json</i> had source map generation enabled during transpilation.
I had followed the debug configuration setup described <a href="https://code.visualstudio.com/docs/nodejs/reactjs-tutorial" target="_blank">here</a>.<br /><br /><b>Solution:</b><br />It turns out that when running/debugging the application under the WSL, an extra property needs to be added to <i>launch.json</i> for the source maps to be found: <i><span class="pl-s">sourceMapPathOverrides</span></i>.<br /><script src="https://gist.github.com/drexler/a27e5a7f76b5fd173c7231c67d91ae05.js"></script>Infrastructure As Code - Terraform2017-04-19T11:44:00+00:002017-04-19T11:44:00+00:00/infrastructure-as-code-terraformWeb first! Cloud first! That's the mantra at work, where the goal is to eventually move all existing apps to the cloud. Woohoo! Toss in the need for the apps to be platform-agnostic after years of being wedded to the Microsoft stack and it makes for exciting times. After a quick comparison of the major cloud vendors in terms of richness of offerings for what we intended, it's no surprise AWS won hands down.
Prototyping by hand and running commands from the AWS CLI is okay for the trivial stuff like launching an EC2 instance or updating security groups. However, for things like CloudFormation - where there was the need to launch and manage multiple instances, the dependencies amongst them, and correctly configured security groups based on generated IPs, all within source control - <a href="https://www.terraform.io/" target="_blank">Terraform</a> was the answer. It's incredible that it's open source, meaning we have the ability to fork the code (which I have :) ) and tweak things as we see fit without waiting for the community to address them.
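As a flavor of what that looks like in practice, here is a minimal, purely illustrative configuration - the AMI id, names, and region are placeholders, not our actual infrastructure - showing how a security group's generated id feeds straight into an instance instead of being copy-pasted by hand:

```shell
# Write out a minimal Terraform configuration for illustration
cat > main.tf <<'EOF'
provider "aws" {
  region = "us-east-1"
}

# Security group whose generated id is referenced below
resource "aws_security_group" "web" {
  name = "web-sg"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Instance wired to the security group by reference, not a hardcoded id
resource "aws_instance" "web" {
  ami                    = "ami-12345678"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web.id]
}
EOF

# Review before applying, all under source control:
#   terraform init && terraform plan
```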
Optimized Builds with Docker2017-02-20T12:12:00+00:002017-02-20T12:12:00+00:00/optimized-builds-with-dockerToying with Docker in the past, its true power wasn't really revealed until it became a necessity on an API project, due to the need to ensure that installed dependencies were the same regardless of which machine was used - a dev box or the build server itself. Essentially, the process was to define two Dockerfiles - one for building the .Net Core app and another for hosting only the necessary "build bits" from the earlier build. The former reduced the need for the Jenkins server to have all the required packages installed on it, and the latter was an immutable, tiny but fast image one could run locally or in prod. Here are more detailed instructions on how to do the same: <a href="https://blogs.msdn.microsoft.com/stevelasker/2016/09/29/building-optimized-docker-images-with-asp-net-core/" target="_blank">Building Optimized Docker Images with ASP.Net Core</a>. Hats off to the guys in Redmond!
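A rough sketch of that two-Dockerfile split is below; the image tags, paths, and assembly name are illustrative assumptions (and note that Docker's newer multi-stage builds now fold this pattern into a single Dockerfile):

```shell
# Build image: full SDK, restores and publishes the app
cat > Dockerfile.build <<'EOF'
FROM microsoft/dotnet:sdk
WORKDIR /app
COPY . .
RUN dotnet restore && dotnet publish -c Release -o /out
EOF

# Runtime image: carries only the published "build bits"
cat > Dockerfile.runtime <<'EOF'
FROM microsoft/dotnet:runtime
WORKDIR /app
COPY out/ .
ENTRYPOINT ["dotnet", "Api.dll"]
EOF

# Build, extract the published output, then build the small shippable image:
#   docker build -f Dockerfile.build -t api-build .
#   docker create --name tmp api-build
#   docker cp tmp:/out ./out && docker rm tmp
#   docker build -f Dockerfile.runtime -t api .
```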
These aren't the droids you're looking for!2014-08-28T02:00:00+00:002014-08-28T02:00:00+00:00/these-arent-droids-youre-looking-forFor the past few days, I've spent time dipping my feet in the Android SDK and generally building baby apps to get a feel and understanding of them. This is in preparation for a larger app I'm building as part of my side projects at work, so it's fun all around. In the past, I've dabbled with Symbian S60 and pre-BB OS 7, but when it comes to app development, the ease with which one can get a working app installed and tested is astounding. Definitely loving it so far!Spring cleaning or sorts…2014-07-23T23:31:00+00:002014-07-23T23:31:00+00:00/spring-cleaning-or-sortsFinally got round to rebuilding my musty, mouldy repo a couple of days back and began to check in some patches - albeit easy ones - just to get back in the groove. Next stop for me will be upgrading to VS2013. As usual, I need to have the best tool out there to play around with, although emacs & Eclipse will continue to be my main preference for development. What can I say? A dork gotta have his <i>toy.</i>