Simon Frey

Writing about tech, podcasting and other thoughts that come to my mind.

golang gopher benchmark wednesday

Known length slice initialization speed – Golang Benchmark Wednesday

I stumbled over the hint that it is better for performance to initialize your slices with a dedicated length and capacity if you know them. Sounds like it makes sense, but I wouldn't be me if I just accepted that without testing the hypothesis.

An example that I am using in real life is creating a slice of ids for querying a database later on with those ids: iterating over the original data structure (in my case a 'map[string]SimonsStruct{Id int, MORE FIELDS}') and copying the ids out.

Normally I used 'make([]int,0)' (len == 0 & cap == 0), so let's see if it would be faster to initialize the slice directly with the right capacity and length.
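To make that use case concrete, here is a minimal sketch of it (the struct and function names are just made up for this example): collect all ids from a map into a slice whose capacity is known upfront, ready for a later database query.

type SimonsStruct struct {
	Id int
	// more fields ...
}

// collectIds copies all ids out of the map into a slice.
// len(data) is known upfront, so we can allocate the full capacity once.
func collectIds(data map[string]SimonsStruct) []int {
	ids := make([]int, 0, len(data))
	for _, v := range data {
		ids = append(ids, v.Id)
	}
	return ids
}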

Keep in mind the tests only work if you know the size of the target slice upfront. If not, sadly this Benchmark Wednesday will not help you.

Benchmark Code

Bad: Initialize slice empty even if you know the target size

const size = 100 // see the size values benchmarked later in the table

func BenchmarkEmptyInit(b *testing.B) {
	for n := 0; n < b.N; n++ {
		data := make([]int, 0)
		for k := 0; k < size; k++ {
			data = append(data, k)
		}
	}
}

Best for big size: Initialize slice with known capacity and add data with append

const size = 100 // see the size values benchmarked later in the table

func BenchmarkKnownAppend(b *testing.B) {
	for n := 0; n < b.N; n++ {
		data := make([]int, 0, size)
		for k := 0; k < size; k++ {
			data = append(data, k)
		}
	}
}

Best for small & medium size: Initialize slice with known capacity & length and add data with direct access

const size = 100 // see the size values benchmarked later in the table

func BenchmarkKnownDirectAccess(b *testing.B) {
	for n := 0; n < b.N; n++ {
		data := make([]int, size, size)
		for k := 0; k < size; k++ {
			data[k] = k
		}
	}
}

Results

The table shows the time it took for every example to initialize all its elements (only measured inside the benchmark loop). Keep an eye on the units! (ns/ms/s)

#Elements (size)    EmptyInit    KnownAppend    KnownDirectAccess
1                   31.00 ns     1.52 ns        0.72 ns
100                 852 ns       81.4 ns        59.1 ns
100 000             1.11 ms      0.22 ms        0.20 ms
1 000 000           10.76 ms     3.13 ms        3.14 ms
100 000 000         2.48 s       0.21 s         0.22 s
300 000 000         6.79 s       0.90 s         0.95 s

Interpretation

That initializing the slice with length & capacity 0 would be the worst was obvious, but I am still surprised that the append approach outperforms direct access for bigger sizes.

But after thinking about it, it totally makes sense. The direct access approach needs to write every entry twice:

1) Initializing the whole array with its zero value (for our int that is '0')
2) Writing the actual value into that slice

Step 1) is not needed with the append approach, as we just reserve a memory location and the previous values stay there until we write the actual values in step 2. For bigger slices this setup overhead outweighs the performance benefit of direct access. This will be even more significant if the values in the slice are not just simple ints but something bigger (e.g. a struct with a lot of fields), as then the setup has to initialize even more zero values.

Conclusion

The hint I found online was right: if you know the size of your target slice, always initialize it with that size as capacity. For small & medium-sized slices use the direct access approach. For very big slices use append.

Thanks for reading and see you next week!

You got any feedback? I would love to answer it on HackerNews

P.S. There is an RSS Feed

Microsoft Azure Anger

Last weekend I attended a Hackathon at Microsoft. Overall it was an awesome experience and I had a lot of fun, so this post has nothing to do with the event itself, and neither does it reflect my overall opinion of Microsoft. They do awesome stuff in a lot of fields, but with Azure they are definitely underdelivering.

During the event, I got in contact with the Azure platform for the first time. Our project idea was to create a website where you can search for news and then, via sentiment analysis, this news would be sorted by “happiness”. The news search and sentiment analysis are offered via Azure's so-called cognitive services, which abstract the ML models away and let you simply use an API to access those services....so far so good. With this premise most of you coders out there will have the thought: “This sounds too easy to fill 24h of programming”. Exactly what I thought...and I was already thinking about also coding an Alexa skill and so on to fill the time. With two experienced developers, we thought the backend would be done in about 4h (conservative estimate) as it would only be stitching together three APIs and delivering that info as a JSON REST API for our frontend team. To keep the fun up and have more learnings during the project we decided to do the backend as a serverless function. But then Azure got in our way...

In the end, it took us ~9h to develop the backend as a serverless function consisting mainly of a 40-line JavaScript file we had to develop in the in-browser “editor” that Azure offers, as all the other approaches we tried didn't work out and we ended up abandoning them. Once again: 9 hours for 40 lines of JS code stitching together three APIs...that is insane. (Btw at 3 am we decided to switch to GCP (Google Cloud Platform) and that did the job in about 45 minutes)

For sure we did things wrong and it could have been done faster, but this blog post is about the hard onboarding and overall bad structure of Azure. Please also keep in mind that Azure is still in a more-or-less early stage and not all of it is broken. In the following, I will walk you through the timeline of this disaster and the suggestions I have in mind to fix some of the most confusing steps. Actually, I will try to avoid these mistakes in my own future projects, so thanks Microsoft for showing me how not to do things xD

Just a bit more background: my partner on the backend had some experience with GCP and I do most of my current projects with AWS, so we knew how things work there...couldn't be too hard to transfer that knowledge to the Azure platform.

Start of the project

So first of all, creating a new Azure account is not that hard, and after entering credit card info you get 100$ of free credit. I actually like how Microsoft solved this: you have two plans. You start with the 100$ free tier and if you spend all of that money you manually have to change to the pay-as-you-go plan. That protects you from opening up an account, doing some testing, forgetting about it and then getting a huge bill a month later (happened to me with AWS). So that is nice for protecting new users who are just starting to test the system. Good job here, Microsoft!

After setting up the account I created a new project and added some of the resources we needed. While creating a serverless function I noticed the tag “(Preview)” on the function I created but didn't think more about it...but actually, that tag should read something like Experimental/Do not use/Will most likely not work properly. We created a Python serverless function (apparently Python functions are still beta there) and tried to get some code in there.

There are three ways to get code into an Azure function:

  • Web “editor”
  • Azure CLI
  • VS Code

...at least for full-featured functions. As we had selected the experimental/beta/preview Python functionality, we only had the latter two options. Not that bad, as it is the same for AWS and I am used to deploying my code via the AWS CLI...shouldn't be much harder with Azure.

My suggestion: do not publish functionality that is obviously not ready yet. Do internal testing instead of using your users for that task.

Azure plugins for VS code

Microsoft offers a wide range of VS Code plugins for Azure. As that is my main editor anyway, I wanted to give them a try. For the serverless functions functionality you need the functions plugin and about 9 other mandatory ones that are some sort of base plugins. 50MB and three VS Code crashes later, the required plugins were finally installed properly. The recommended login method did not work and I had to choose the method of authenticating via the browser instead. Not that big of a deal, but as they recommend the inline method one would think it should work. (It didn't work for the other folks in my team either...so it had nothing to do with my particular machine)

You would think that 500MB should be enough to finally be able to deploy some code...but you still need 200MB more for the Azure CLI that is required for the plugins to work properly.

Having finally installed all of it, you can see all your Azure functions and resources in VS Code. I started to get a bit excited as it looked like from now on the development would be straightforward and easier than I am used to from AWS.

But those 700MB of code did not work properly....the most important function, “deploy”, failed without any detailed error message...AAAAAAARRRG. Why do I have to install all that crap when it can't do the most basic task it has to do: get my code into their cloud.

Keep your tooling modular and try to do fewer things, but do them right

Code templates

A nice idea: when you create a new serverless function, Azure greets you with a basic boilerplate code example showing you how to handle the basic data interfaces.

It might have been because we selected the alpha functionality “Python”, but we didn't actually get Python code here, we got JavaScript. So your function is prepopulated with code that is not able to run because it is in the wrong programming language. We were lucky and recognized that right away, but you could get really confusing error messages here if you start developing in JS while actually having a Python runtime.

Better no boilerplate code than one in the wrong programming language

But at least it is colorful

So, next try: the Azure CLI. The first thing you notice is that the CLI uses all sorts of different colors...but that does not help if you are annoyed and want to get things done.

That is something you also see in the Azure web interface...it has quite a few UX issues, but they do have over five color themes that you can choose from for styling the UI...Microsoft, I'm not sure if you set your priorities right here ;)

Also, the CLI did not get us where we wanted....either due to our own incompetence or due to the CLI itself, no clue. Either way, I would blame Azure, as it is their job to help developers onboard and at least get basic tasks done (we still only want to deploy a simple “hello world”) in an acceptable time.

Focus less on making your UI shine in every color of the rainbow and try to improve documentation and onboarding examples

Full ownership of a resource still does not give you full privileges

After finally being able to deploy at least the “hello world” we wanted to go a step further...work concurrently on the project. Yes, until now we had mainly done pair programming on a single machine.

As I was the owner of that resource I also wanted to give my teammate full access to it, so that he could work on the resource and add functions if required. I granted him “owner” access rights (the highest available) but he was still not able to work properly with that function. In the web UI it did work more or less, but then again in VS Code there was no way to do anything (adding a function or deploying it). I ended up doing something that goes against everything I learned about security: I logged in with my credentials on his machine.

So imagine yourself now already sitting in front of your laptop for about 4 ½ hours and you did not manage to do any of the actual work you set out to do.

Ditching Azure Functions and switching to GCP

That was the moment we ditched the idea of doing the backend as an Azure function. We switched to GCP, where we started all over again. As I had never worked with that platform either, I expected a similarly hard start to the one I had just had with Azure over the last few hours. But about 25 minutes later we had achieved more on GCP than with Azure until then.

Something both Azure and GCP do better than AWS: they show the logs of a serverless function in the same window as the function itself. AWS has a different approach here and you have to switch to the cloud logs when you want to get info about your function and how it performed. Props to both Google and Microsoft for solving this a lot better!

Actually a hint for AWS: give your users all controls and info in a single place

Cognitive services

The prizes you could win at the Hackathon were tied to using Azure, so we stuck with the cognitive services for the news search and the sentiment analysis. Overall the API is straightforward: send your data and get the results back.

One thing we were told in a presentation and that you should keep in mind when using the cognitive services: you do not control the model and it could change at any moment in time. So if you use the cognitive services in production, you should continuously check that the API didn't change its behavior in a way that influences your product badly. But most of the time it is still a lot cheaper and better than building the model yourself.

The problems we had with the services were again authentication issues. Quite confusingly, some of the cognitive services (e.g. the sentiment analysis) have different API base URLs depending on where you register that cognitive service, and others do not. I assume they need that manual setting of data centers for a particular (unknown to me) reason. Still, I would propose to have all the cognitive services bound to a location.

The news search, for example, is not bound to a location, so we had two different behaviors of the API base URLs in our short and simple application:

  • One URL for all locations.
  • Only a certain location is valid for your resource. If you point to a wrong API location you get an “unauthorized” response

Pointing to the wrong location is pure incompetence on the developer's side, but it would help a lot if there were a distinct error code/message for that scenario.

Have the same base URL behavior for all cognitive services

Return some sort of 'wrong location'-error if you have a valid API token but you are pointing to the wrong location

Insufficiently documented SDKs

Azure offers SDKs for using their services. We gave the JS SDK for the cognitive services a try. Here we had both ups and downs: first, props to the developers coding the SDKs, as they are straightforward and do what they should. Even the code itself looks good...but why the hell do I have to look into the code of the SDKs to find all the options the functions offer? When you stick to the documentation provided via the GitHub readme or NPM you only get a fraction of the functionality. We were confused that Microsoft's own SDKs seemed not to be API complete. Looking into the code we saw they actually are API complete and offer a lot more options than documented.

Please Microsoft: Properly document your functionalities!

IMO there must be deep problems with the internal release processes at Azure. It is not acceptable that an IT company that has been in the industry for so long allows itself such a basic mistake. You should not release your products (and I see the SDKs as such) without proper documentation.

“Code Examples”

During our trial-and-error period of trying to get the JS SDK running, we stumbled upon the quickstart guide for the cognitive services: “Quickstart: Analyze a remote image using the REST API with Node.js in Computer Vision”

Instead of using their own SDK and explaining how to use it, they show you how to manually build an HTTP request in JS. Sure, that can be helpful for new JS coders, but if you have an SDK for that particular purpose...why are you not using it? Looks like the left hand does not know what the right hand is doing.

Stick to one way of doing things. If you have an SDK, also use it in your quickstart guides to be consistent

Conclusion

In the end, we did port the code back from GCP to an Azure function (again ~1h of work). We selected JS instead of Python and coded completely in the web UI...that did work. I now know how real Microsoft business developers do their daily business...never leave the web UI and just accept that life is hard.

Microsoft failed to deliver an adequate experience here and lost me as a potential customer. How can it be that I was able to do the same things in a fraction of the time on GCP? (And keep in mind: it was already 3 am, I was super tired and I had also never worked with GCP before)

None of the three major players is perfect, and sure, I understand it is hard to deliver fast and keep good quality in this highly competitive market. But maybe actually going the extra step will help to win in the end.

Once again: This is me only rating the onboarding experience of Azure in particular! No general opinion on Microsoft.

Last one: The Azure web UI didn't work in Chrome. So if you have issues with that, Firefox did the trick for us ;)

Women in front of laptop

Why every SaaS company should reevaluate their live chat strategy in 2019

What would you say if I told you that you are leaving over 70% of your potential customers on the table because you do not have a live chat on your website? FurstPerson discovered exactly that: 77% of your customers are very unlikely to make a purchase if you do not offer a live chat. Wow! That huge number should be enough to convince everyone to directly search for a live chat solution that wins those customers back.

The technical setup of a website live chat is the easy part. There are other points you have to think about and I hope that this article helps you to reevaluate your live chat strategy.

Customers demand live chat support

But email support has worked for a decade now, so why are customers so insistent on a live chat support channel?

We are living in a time where we can get everything we want in no time. Do you sometimes find yourself annoyed that you have to wait until the next day for your Amazon delivery to arrive at your door? Or when your spouse does not reply within minutes? Customers are as impatient as you are yourself!

When your (potential) customers have a question, they want to have it answered in a few minutes instead of waiting a day until their email tickets are finally read. In 2019, customers do not excuse a slow support channel anymore!

So if competitors are able to answer customers' questions faster than you are, they will outperform you in sales and overall customer satisfaction.

We as entrepreneurs have to adapt to this new customer requirement to stay in the game! Every site should have a live chat option

A bad experience is worse than no live chat at all

The only thing that hurts a business more than no live chat is a bad live chat experience. You are not done with just putting a widget on your page and configuring it to send you an email. I hate it when I open the live chat window, type in my question and then after 1 minute a bot tells me that the team is away and that I should please use some weird email form. Why the hell is there a live chat window at all if no one is answering in less than 2 minutes? If you use your chat that way, please send your customers directly to the email form and tell them how long they will have to wait on average for a response. You are then not 100% in line with the modern live chat situation, but it is still way better than an email form that just looks like a live chat window!

If your live chat is just another design for your email form, please do not use that live chat at all

Live chat support needs (wo-)manpower

Bots, artificial intelligence and machine learning sound very nice in a live chat company's sales pitch, but in the end you always need a human being on the other side of your customer live chat. Technology will help you to a certain point, but the biggest value you create is your customer feeling appreciated. You show that you and your company care so much for them that there is always a human being happy to help with all their issues.

Did you ever have a friend tell you about a company that helped super fast with an issue? I'm 100% sure that friend is still a customer of that company. We humans want to feel valued, and if someone does that, we will stick with that person/company.

Value your customers with human support agents instead of heartless bots. This investment will definitely pay back!

Your live chat is your most honest feedback channel

Compared to dedicated customer service and feedback forms, your live chat can and will be your most honest feedback channel. You will experience the problems your customers have with the product the second they stumble upon them. And yes...sometimes just reading an FAQ would have helped your customer to circumvent that problem, but that is not how customers function. They want your product to make their life easier, and they will not work through extensive manuals to understand how to use your product.

If you get the same questions over and over again, you definitely should think about changing your product at that particular point. And best of all: you can just ask your customer during the live chat session what would help them circumvent that problem in the future. They will feel valued and you get a customer survey for free ;)

Live chat helps you understand just in time what problems your customers stumble upon. Use that feedback to improve your product

With great power comes great responsibility

One thing at the end: there are awesome features for pull marketing within some extensive live chat solutions, but please try to use them wisely. A lot of your potential customers will run away from your website screaming if you bombard them with popups and windows: “Here is our newsletter”, “Get 20% off”, “Start a live chat with us”

You can use certain triggers if you notice your customer is stuck somewhere. Maybe they are hovering for 20 seconds over the pricing page or are extensively scrolling up and down on your page. Then it is a good moment to offer them live chat support by automatically opening the live chat window. The fact that they have been on your site for 2 seconds is not a valid reason!

Try to only automatically open the chat window if your customer seems stuck. Please do not annoy them with useless popups. They will find the live chat in the bottom right corner when they need it

Live chat solution for solopreneurs

Full disclosure: I am the co-founder of gramchat

With gramchat we tried to solve the issues described above and create a live chat solution that helps you serve your customers best. Gramchat sends customer messages directly to your Telegram Messenger and from there you can answer them directly – no extra app required. Gym, beer with friends or during your day job: help your customers wherever you are.

With the Telegram integration we try to solve the “live chat is just an email form” problem. With gramchat you are able to answer your customers within the important first 90 seconds.

But enough advertisement! There are several great live chat solutions out there and you should pick the one that suits you best. For a small team or as a solopreneur, gramchat may be your perfect fit :D

I would love it if you gave it a try => gramchat.me

Wish you all an awesome time! Simon

No wifi on laptop image

As you might imagine from the title, at the moment of writing this article I am sitting on a train from Berlin to Hamburg. For those of you who have never been to Germany...we do have WIFI on the trains here, but contrary to what you might expect it is really bad. (And if it is sometimes fast enough, you only get 200MB of traffic <= thanks to MAC address randomization that can be bypassed)

Wait, what? Bad WIFI on trains in the first-world industrial nation Germany? Yes, even during my travels on a train in Thailand I had way better WIFI than I have ever experienced on German trains. There are two main factors for that:

  • Bad mobile network overall...if you leave the bigger cities you most of the time do not even have EDGE (yes kids, slower than 3G) or any mobile network connection at all. So sad!
  • Cheap hardware in the trains. The modems in the trains are standard 3G modems you could also purchase as a mobile hot-spot device. Sure, they are a bit more powerful, but they are not made for this special use case: connecting to new base stations at a high rate. It actually is a quite hard technical challenge to have a modem do this on a high-speed train. But it is 2019...we are thinking about sending people to Mars...and as we can see in other countries this problem is apparently solved. Maybe some more money would be well invested here.

But enough ranting about the WIFI in here (which is BTW currently nonexistent)

OK, sorry, one more thought: looking around me I see a lot of people in nice suits working on their laptops. Imagine them earning 60€/hour and needing double the time for a task because the WIFI is so weak. Assuming there are 100 (a conservative estimate) such people on a train, then during this single trip from Berlin to Hamburg (2h) there is 60€ * 100 * 2 = 12 000€ of wasted human capital....better not tell that to any company paying their employees for the train ride and the “work time” during this trip.

Actually this article is about tech

This is not the first time I have experienced this, but why am I so triggered this time that I decided to write a blog post about it? As a web developer I am currently working on a live chat project (gramchat.me – please be kind, the landing page would be finished if I could actually work here) and I wanted to finish the landing page & documentation during this trip.

Now I find myself sitting here and my laptop, normally the device paying my rent, is nothing more than a dumb black box....close to every workflow I have requires the Internet, I can't work offline. grrrrr

How could that happen? Normally I am always at places with good WIFI or mobile network (Berlin Big City Life) and so some bad habits sneaked in:

  • Development work
    • Google fonts
    • Payment gateway that needs to be configured
    • Documentation (How could anyone write software before Stack Overflow?)
    • Package tools for just-in-time downloading of dependencies
    • GitHub Issues and Board for organization
    • Backend infrastructure is built on AWS Lambda (can't test that offline)
  • Entertainment
    • Movies are on Netflix
    • Music is on Spotify
    • I read mostly blog posts and web articles (via Hackero ;))
  • Communication
    • Telegram/WhatsApp/Email
  • Information
    • I am struggling to write this article as a non-native speaker, as I can't use Google Translate
  • ...and so on

Short interruption: Because of other issues I had to change to another (slower) train. This one does not have WIFI at all...so now next level shit.


I sit here and basically have three options of what to do:

  • Compose electronic music with LMMS, which I downloaded a few weeks ago but have no clue how to use :'D
  • Code something in Go. Thanks GoLand for your awesome built-in standard lib documentation!
  • Write this article ranting about the German train situation and about myself being so dependent on a resource I thought of as being as natural as air

So here I am writing the article :D

Prevent such a situation in the future

So the biggest fail is me not being prepared for offline usage of my devices. So what will I do to prevent this in the future? Technical problems need technical solutions:

  • Entertainment
    • Music: Have at least some of my favorite playlists available offline
    • Movies: I actually don't see it as a big problem not to binge-watch for some hours => keeps me focused on working
    • Get an off-line “read it later” system. A while ago I used wallabag and will reinstall it on all my devices.
  • Communication
    • You actually can not do much about it...so nothing to improve here
    • If you do not have an off-line usable email and messaging client you should get yourself one. (Telegram has a nice desktop standalone) It is nice to at least be able to search through archived emails/chats
  • Information
    • Off-line dictionary it is
    • Is there a Firefox/Chrome plugin that saves all the web pages I visit to an off-line storage? So that I can go back in my history and access the pages I visited before...if not I might code one.
  • Development work
    • There are a lot of different off-line code documentation systems. I chose Zeal as it works on Linux and is standalone (the other ones work in the browser, and as I surf most of the time in private mode they would not work for me, as I wipe the local storage at least every few days)
    • Off-line PHP server => was actually quite easy. Did you know PHP has a built-in server? php -S localhost:8080
    • AWS Lambda offline testing framework? No clue how to do this yet...maybe a good topic for another blog post
    • There are actually some clients for GitHub with offline issue support. I will give them a try
    • Cache/save web resources locally. Maybe a good idea overall...better not to include Google as a dependency in your project, as they will abuse the data you send them with every visitor
    • There is a (sadly old) Stack Overflow dump that could end up in some tool to search through it...would be amazing. (but it might take a lot of disk space)

Oh girl, another thing came up: I have to show my train ticket, which is a PDF attached to an email...that I never downloaded. What is going on here...my life goes nuts without Internet. Download your important tickets/documents


So overall this trip showed me how dependent I am on the Internet and that I should change that. Please see this post as a work in progress, as I will update it and add offline tools when I get to know them and have more experience with them.

Overall there is one main learning: download stuff instead of only opening it in the browser. (Same with my university PDFs...I never downloaded them for offline use, so no research for me now)

If someone has been in this situation themselves and found other tools that helped, I would love it if you shared them with me, so that I can introduce them into my stack and update this article.

So now I hope that the EDGE Internet connection I have on my mobile hotspot right now will be enough to upload this article :'D

Wish you an awesome (online) time!

Simon

P.S. Another thing I found: check which applications are using the Internet on your machine, so that if you only have low bandwidth this important resource does not get sucked away by an open Skype or so.


Did you like this post?

Donate: Donate Button or Patreon

Feedback: Email_


RSS Feed – This work is licensed under Creative Commons Attribution 4.0 International License

No WIFI Icon made by Freepik from Flaticon is licensed by CC 3.0 BY

or a less click-baity title: An introduction to net/http timeouts

Source: https://commons.wikimedia.org/wiki/File:Gophercolor.jpg


First of all, as you may have already recognized from the title, this blog post is standing on the shoulders of giants. The following two blog posts inspired me to revisit the net/http timeouts, as the linked blog posts are in some parts outdated:

Give them a visit after you read this post and see how things have changed in such a short time ;)


Why you should not use the standard net/http config

The Go core team decided not to set any timeouts at all in the standard net/http client or server config, and that is a really sane decision. Why?

To not break things! Timeouts are a highly individual setting, and in more cases a too-short timeout will break your application with an inexplicable error than a too-long one (or in Go's case, none at all) would.

Imagine the following different use cases of the Go net/http client:

1) Downloading a big file (10GB) from a webserver. With an average (German) internet connection this would take roughly five minutes.

=> The timeout for the connection should be longer than five minutes, because anything less would break your application by canceling the download in the middle (or third, or whatever percentage) of the file.

2) Accessing a REST API with a lot of concurrent connections. This should normally take at most a few seconds per connection.

=> The timeout should be no more than 10 seconds, as anything that takes longer would mean that you are keeping that connection open for too long and starving your application, as it can only have X (depending on system, configuration and code) open connections. So if the REST API you access is broken in a way that keeps connections open without sending you the data you need, you want to prevent it from doing so.

So, for what scenario should the standard lib be optimized? Trust me, you do not want to decide that for millions of developers around the globe.

That is why we have to set the timeouts, so that they fit our use case!

So never use the standard Go http client/server! It will break your production system! (Happened to me, as I forgot my own rule once)

What types of timeouts occur in an HTTP connection?

I assume you have a basic understanding of the TCP and HTTP protocols. (If not, Wikipedia is a good starting point for that)

There are mainly three different categories of timeouts that can occur:

  • During connection setup
  • During receiving/sending the header information
  • During receiving/sending the body

As you might already expect from our two examples in the introduction, the timeout we have to care about the most is the one regarding the body. The other ones are most of the time shorter and similar in every setup (e.g. there is only a certain amount of headers that will be sent). We still have to think and care about timeouts for the headers, as there are certain DoS attacks that play with malformed headers or never finish sending a header (the Slowloris DoS attack), but we will come to this at a later point in the post.

You should at least do this: The easy path

net/http gives you the possibility to set a timeout for the complete transfer of data (setup, headers, body). It is not as fine-grained as the solutions discussed later, but it will help you prevent the most obvious problems:

  • Connection starving
  • Malformed header attacks

So you should use at least these timeouts on every Go net/http client/server you use!

Client

The following example client gives you a complete timeout of 5 seconds.

c := &http.Client{
	Timeout: 5 * time.Second,
}
c.Get("https://blog.simon-frey.eu/")

If the connection is still open, it will be canceled with net/http: request canceled (Client.Timeout exceeded while reading ...)

So this timeout would work for small files, but not for the download of a large file. We will see how we can have a variable timeout for the body later in the post.

Server

For the server we have to set two timeouts in the easy setup: read and write. ReadTimeout defines how long you allow a connection to be open while a client sends data. WriteTimeout is the same for the other direction. (Yes, it could also be that you send data somewhere, the packets never get acknowledged (TCP ACK) and your server would starve again)

s := &http.Server{
	ReadTimeout:  1 * time.Second,
	WriteTimeout: 10 * time.Second,
	Addr:         ":8080",
}
s.ListenAndServe()

So this server would listen on port 8080 and have your desired timeouts.

For a lot of use cases, this easy path may be enough. But please read on and see what other things are possible :D

[Client] In-depth configuration of timeouts

One thing to note before we get started here is the following differentiation:

  • The easy-path timeout (above) is defined for a complete request, including redirects
  • The following configurations are per connection (as they are defined via http.Transport, which has no information about redirects itself). So if a lot of redirects happen, the timeouts add up per connection. You can use both to prevent endless redirects

Connection setup

In the following setup there are two parameters we set with a timeout. They differ in their connection type:

  • DialContext: defines the setup timeout for an unencrypted HTTP connection
  • TLSHandshakeTimeout: covers the setup timeout for upgrading the unencrypted connection to an encrypted one (HTTPS)

In a 2019 setup, you should always try to talk to encrypted HTTPS endpoints, so there are very rare cases where it makes sense to only set one of the two parameters.

c := &http.Client{
	Transport: &http.Transport{
		DialContext: (&net.Dialer{
			Timeout: 3 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout: 10 * time.Second,
	},
}
c.Get("https://blog.simon-frey.eu/")

By setting these parameters you define how long the setup of a connection may last at most. This helps you with 'detecting' hosts that are down more quickly (for actual detection you have to do more than these few lines). So you are not waiting in your project for a host that is/was down in the first place.
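As a small illustration of that 'detecting' (a hedged sketch, not from the original post): the error returned by the client implements net.Error, so you can check whether a timeout was the cause and treat the host as unreachable in that case.

c := &http.Client{
	Transport: &http.Transport{
		DialContext: (&net.Dialer{
			Timeout: 3 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout: 10 * time.Second,
	},
}

_, err := c.Get("https://blog.simon-frey.eu/")
if err != nil {
	// Ask the error whether a timeout was the cause
	if netErr, ok := err.(net.Error); ok && netErr.Timeout() {
		log.Println("connection setup timed out, host seems to be down")
	}
}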

Response headers

Now that we have an established (hopefully HTTPS) connection, we have to receive the meta information about the content we will get. This meta information is stored in the headers. We can set timeouts for how long we allow the server to take to answer us.

Here again are two different timeouts to be defined:

  • ExpectContinueTimeout: configures how long you want to wait, after sending your payload, for the beginning of an answer (in the form of the first response headers)
  • ResponseHeaderTimeout: with this parameter you set how long the complete transfer of the headers is allowed to take

So you want to have the complete header information at most ExpectContinueTimeout + ResponseHeaderTimeout after you sent your complete request

c := &http.Client{
	Transport: &http.Transport{
		ExpectContinueTimeout: 4 * time.Second,
		ResponseHeaderTimeout: 10 * time.Second,
	},
}
c.Get("https://blog.simon-frey.eu/")

By setting these parameters, we can define how long we accept the server to take for an answer and therefore also for its internal operations.

Imagine the following scenario: you access an API that resizes an image you send to it. So you upload the image and normally it takes ~1 second to resize it and then start sending it back to your service. But maybe the API crashes for whatever reason and takes 60 seconds to resize the image. As you have now defined the timeouts, you can abort after a couple of seconds and tell your own customers that API xyz is down and that you are in contact with the supplier...better than having your fancy image editor loading for ages and not showing any status information, and all that because of a bug that is not even your fault!

Body

By definition, the timeout for the body is the hardest, as this is the part of the response that varies the most in size and thereby in the time it needs for transfer.

We will cover two approaches that help you to define a timeout on the body:

  • A static timeout that kills the transfer after a certain amount of time
  • A variable timeout that kills the transfer if no data was transferred for a certain amount of time

Static timeout

We are dropping all errors in the example code. You should not do that!

c := &http.Client{}
resp, _ := c.Get("https://blog.simon-frey.eu")
defer resp.Body.Close()

time.AfterFunc(5*time.Second, func() {
	resp.Body.Close()
})
bodyBytes, _ := ioutil.ReadAll(resp.Body)

In the code example we set a timer that executes resp.Body.Close() after 5 seconds. With this we close the body, and ioutil.ReadAll will fail with a read on closed response body error.

Variable timeout

We are dropping most of the errors in the example code. You should not do that!

c := &http.Client{}
resp, _ := c.Get("https://blog.simon-frey.eu")
defer resp.Body.Close()

timer := time.AfterFunc(5*time.Second, func() {
	resp.Body.Close()
})

bodyBuffer := &bytes.Buffer{}
for {
	// We reset the timer, for the variable time
	timer.Reset(1 * time.Second)

	_, err := io.CopyN(bodyBuffer, resp.Body, 256)
	if err == io.EOF {
		// This is not an error in the common sense:
		// io.EOF tells us that we did read the complete body
		break
	} else if err != nil {
		// You should do error handling here
		break
	}
}

The difference here is that we have an endless loop that iterates over the body and copies data out of it. There are two ways this loop will be left:

  • We get the io.EOF error from io.CopyN; this means we read the complete body and no timeout needs to be triggered
  • We get another error; if that error is the read on closed response body error, the timeout was triggered

This solution works because io.CopyN is blocking. So if there is not enough data to read from the body (in our case 256 bytes), it will wait. If the timeout triggers during that time, we stop the execution.

My 'default' config

Again: This is my very own opinion on the timeouts and you should adapt them to the requirements of your project! I do not use this exact same setup in every project!

c := &http.Client{
	// Prevent endless redirects: the client timeout covers the complete
	// request including all redirects
	Timeout: 10 * time.Minute,

	Transport: &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   10 * time.Second,
			KeepAlive: 10 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout: 10 * time.Second,

		ExpectContinueTimeout: 4 * time.Second,
		ResponseHeaderTimeout: 3 * time.Second,
	},
}

[Server] In-depth configuration of timeouts

As there are no dedicated dial-up timeouts for http.Server, we will start directly with the timeouts for the headers.

Headers

For the request headers we have a dedicated timeout: ReadHeaderTimeout, which represents the time within which the full request headers (sent by a client) should be read. So if a client takes longer to send its headers, the connection will time out. This timeout is especially important against attacks like Slowloris, where the header section never gets finished and the connection is thereby kept open all the time.

s := &http.Server{
	ReadHeaderTimeout: 20 * time.Second,
}
s.ListenAndServe()

As you may have already recognized, there is only a ReadHeaderTimeout, because for sending data to the client Go does not distinguish between headers and body for the timeout.

Body

Here we have to differentiate between the request body (that is sent from the client to the server) and the response body.

Response body

For the response body there is only one static solution for a timeout:

s := &http.Server{
	WriteTimeout: 20 * time.Second,
}
s.ListenAndServe()

As long as the connection is open, we cannot differentiate whether the data was sent correctly or whether the client is doing bogus here. But as we know our payload data, it is quite straightforward to set this timeout based on the information we have about our server. So if you run a file server, this timeout should be longer than for an API server. For testing purposes you can set no timeout at all and track how long a 'normal' request takes. Add a few percent of variance there and then you should be good to go!
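If you want to track that, here is a minimal sketch of such a measurement (the wrapper name withDuration is made up for this example): a tiny middleware that logs how long each request takes, so you can derive a sensible WriteTimeout from real traffic before enforcing one.

// withDuration wraps a handler and logs how long every request took
func withDuration(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		log.Printf("%s %s took %s", r.Method, r.URL.Path, time.Since(start))
	})
}

While testing, register your real handler wrapped in it (Handler: withDuration(yourHandler)) and watch the logs.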

Request body

Attention: if you set the WriteTimeout it will have an effect on the request timeout as well. This is because of the definition of the WriteTimeout: it starts when the headers of the request were read. So if reading from the request body takes 5 seconds and your write timeout is 4 seconds, it will also kill the reading of the request body!

For the request body there are again two possible solutions:

  • Static timeouts that we can define via the http.Server config
  • Variable timeouts, for which we have to build our own code workaround (as there is currently no built-in support for that)
Static

For a static timeout we can use the ReadTimeout parameter we already used in the easy path:

s := &http.Server{
	ReadTimeout: 20 * time.Second,
}
s.ListenAndServe()
Variable

For the variable timeout we need to work at the level of the handlers. Do not set a ReadTimeout, because the static timeout will interfere with the variable one. Also, you must not set a WriteTimeout, as it is counted from the end of the request headers and thereby will also interfere with the variable timeout.

We have to define our own handler for the server; in our example we call it timeoutHandler. This handler does nothing but read from the body with our loop and time out if no data is sent anymore.

type timeoutHandler struct{}

func (h timeoutHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	defer r.Body.Close()

	timer := time.AfterFunc(5*time.Second, func() {
		r.Body.Close()
	})

	bodyBuffer := &bytes.Buffer{}
	for {
		// We reset the timer, for the variable time
		timer.Reset(1 * time.Second)

		_, err := io.CopyN(bodyBuffer, r.Body, 256)
		if err == io.EOF {
			// This is not an error in the common sense:
			// io.EOF tells us that we did read the complete body
			break
		} else if err != nil {
			// You should do error handling here
			break
		}
	}
}

func main() {
	h := timeoutHandler{}
	s := &http.Server{
		ReadHeaderTimeout: 20 * time.Second,
		Handler:           h,
		Addr:              ":8080",
	}
	s.ListenAndServe()
}

It is a similar approach to the one we chose for the client. You have to define this timeout loop in every handler separately, so you should maybe consider building a helper function for it (see the sketch below), so that you don't have to rewrite the code over and over again.
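Such a helper could look like the following sketch (the function name readAllWithIdleTimeout is made up for this example); it is the same loop as above, just extracted so every handler can reuse it:

// readAllWithIdleTimeout reads the complete body, but aborts if no new
// data arrives for the given idle duration
func readAllWithIdleTimeout(body io.ReadCloser, idle time.Duration) ([]byte, error) {
	timer := time.AfterFunc(idle, func() {
		body.Close()
	})
	defer timer.Stop()

	buf := &bytes.Buffer{}
	for {
		// Every chunk we manage to read pushes the deadline further out
		timer.Reset(idle)

		_, err := io.CopyN(buf, body, 256)
		if err == io.EOF {
			// Complete body read
			return buf.Bytes(), nil
		}
		if err != nil {
			// Either a real error or the error caused by the timer
			// closing the body (the timeout case)
			return buf.Bytes(), err
		}
	}
}

Inside ServeHTTP the loop from above then becomes a single call: bodyBytes, err := readAllWithIdleTimeout(r.Body, 1*time.Second).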

My 'default' config

Again: This is my very own opinion on the timeouts and you should adapt them to the requirements of your project! I do not use this exact same setup in every project!

s := &http.Server{
	ReadHeaderTimeout: 20 * time.Second,
	ReadTimeout:       1 * time.Minute,

	WriteTimeout: 2 * time.Minute,
}

Conclusion

I hope you liked this blog post and that it helped you to understand the different timeouts in Go a little bit better. If you have any feedback, questions or just want to say 'Servus' (Bavarian German for hello), do not hesitate to contact me!

Feedback: Email

Donate: Donate Button or Patreon

RSS Feed – This work is licensed under Creative Commons Attribution 4.0 International License


Sources

https://golang.org/pkg/net/http/

https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779

https://blog.cloudflare.com/exposing-go-on-the-internet/

Gopher Image (CC BY-SA 3.0): Wikimedia

Source: https://easylinuxtipsproject.blogspot.com/p/mint-xfce-old.html

While I was at my parents' over the holidays (Christmas 2017), I had the usual IT support stuff to do that always happens to tech-savvy kids when they are back home.

As I have been a happy Linux user for over a decade now, I asked myself if it would be a good idea to switch my parents away from Win 10 to a GNU/Linux based system (I will call it only Linux during the rest of the post. Sorry, Richard ;) ).

I did that, and now 2 years later I still think it was a good idea: I have the peace of mind that their data is kinda safe, and they also call me less often regarding technical issues with the system. (Yes, Win 10 confused them more than Ubuntu does.)

In the following I would like to describe this ongoing journey and how you can follow my example.

The post is structured in the following parts:

  • Preparation
  • Switching over
  • Ongoing improvements
  • Conclusion

Please keep in mind that this setup is my very own solution and it is likely that you will need to tweak it to your needs. Disclaimer: I do not care about “FOSS only” or anything like that.

Preparation

Background about my parents' computer usage: they mainly use their machine for email and web stuff (shopping, social media, online banking,...) and are not heavily into hardware-intensive gaming or anything like that.

As my parents already used a lot of Free Software as their daily drivers (Thunderbird, Firefox), I did not have to do a big preparation phase. But I still switched them (still on their Win 10) to LibreOffice so that they could get used to it before changing the whole system.


That is my first big piece of advice for you:

Try not to overwhelm them with too many new interfaces at once. Use a step-by-step solution.

So first of all, keep them on their current system and help them adapt to the FLOSS software that will be their daily driver on Linux later on.


So two steps for preparation here:

1) Sit down with your folks and talk through their daily usage of their computer (please don't be so arrogant as to think you already know it all)

2) Try to find software replacements for their daily drivers, that will work flawlessly later on the Linux machine. The ones I would recommend are:

  • Firefox as Browser (and maybe Email if they prefer webmail)
  • Thunderbird for Emails
  • GIMP for Image Editing
  • VLC as Media Player
  • LibreOffice instead of MS Office

Now that you have found and set up replacements for the proprietary Windows software, you should give them time to adapt. I think a month is suitable. (FYI: I got the most questions during this time; the later switch was less problematic)

Switching over

So your parents have now gotten used to the new software, and that will make it easier for them to adapt to the new system, as they now only have to adapt to the new OS interface and not additionally to a lot of new software interfaces.

Do yourself a favor and use standard Ubuntu

I know there are a ton of awesome Linux distros out there (btw. I use Arch ;)), but my experience during this journey brought me to the conclusion that standard Ubuntu is still the best. That is mainly because all the drivers mostly work out of the box and the distro does a lot automatically. (Because of that, my parents were able to install a new wireless printer without even calling me...beat that, Gentoo ;))

On top of that: the Ubuntu community is multilingual and open to newbies.

The journey until Ubuntu

Before Ubuntu we tried various other distros, all falling short at some point (please bear in mind that these are all awesome projects and for myself they would work 100%, but for non-technical people like my parents a distro just needs to be really solid):

1) Chalet OS, as it was promoted as the most Windows-like. As it is based on XFCE it is lightweight, but the icons and styles differ all over the UI. So you get confused because the settings icon always looks different depending on where in the system you are.

2) Elementary OS, because I love the UI myself. No clue why, but my parents never got warm with it. It is just a bit too far away from what they are used to.

3) Solus OS has a more Windows-looking UI again, and it worked better for my parents. But in the end you have to say Solus is just not there yet. The package manager has too few packages and whenever you have a problem it is super hard to find a solution on the net. Plus: the UI crashed at least once a day. (IMO a driver problem with the machine, but still, after hours of work we did not find a solution.)

4) Finally Ubuntu, and that has worked nicely and smoothly (for over 8 months now)

Nuke and pave

So you selected the distro and are now able to nuke and pave the machine. I think I do not have to explain in-depth how to do that, just two important things:

  • Back up all your parents' data to an external hard drive (copy the complete C: drive)
  • Write down upfront what software you want to install and make sure you also back up the configuration and data of those programs

Cheating: if you want to amaze them with the new system even more and the machine still runs on an HDD, replace it with an SSD, so the Linux system feels even better and faster ;)

Configuration

After you installed the distro, do a complete configuration. (Yes, go through every setting and tweak it if needed)

Now install the software your folks already used on their Windows machine and make sure it is configured in the exact same way as it was on the old system! (That will help a lot in keeping morale up, because then there is already something that feels familiar to them)

I found that it is best to place the shortcuts of the applications your parents use the most in the bar on the left side of Ubuntu, so they find them easily

Sit down with your parents and ask them what data they need from the old system and copy only that over. This way you clean up the file system by not copying over the old crap they have not used for ages, and if they find out later that there is more data they need, it is stored on the backup drive.

Introduce them to the new system

After the configuration and setup are now complete, you need to allocate some time for introducing them to the new system. You know your parents best, so do it in the way they like.


For me the following routine worked best:

0) Explain it to them in two individual sessions (as usually one of them is more tech-savvy than the other, and this way both have the chance to ask you questions individually)

1) Shutdown the machine

2) Let him/her start the machine

3) Tell her/him to try to do their daily business and whenever questions come up, explain how to solve the issue (never touch the mouse or keyboard! If you take over, it is very likely that you will be too fast)

4) Stop after 60 minutes, and if there are still questions do another session the next day (imagine yourself learning something completely new to you – maybe Chinese – are you able to concentrate for more than an hour?)


Some topics I would recommend you to cover during the introduction:

  • How to set up a new wifi connection (especially if the machine is a laptop)
  • How to install new software
  • How to set up a new printer/scanner
  • How to print/scan
  • How to restore deleted files
  • How to get data from/to a USB stick or mobile device
  • How to shut down the machine (not that easy to find on Ubuntu)

Ongoing improvements

So normally the system should now work as intended, and if you are lucky it saves you a lot of problems in the future. In this section I will give you some more recommendations that helped to make the experience even better:

  • Linux always asks you for your password if you are doing something that could deeply harm the system. So I told my parents: whenever that dialog (I showed it to them) pops up, they should keep in mind that they could destroy the whole machine with this operation, and if they want they can call me first.

  • Show them the app store and tell them that whatever they install from there is safe (so no viruses or anything) and that they can install everything they want as long as it is from there. It is fun to find cool new software and games, so help them to experience that fun too :D

  • Backups! As it is really easy with Linux, you should do an automatic daily/hourly backup of their complete home folder. I use borg for that. (I plan to write an in-depth blog post about borg in the future; it will be linked here when it is done). So now, whenever my parents call me and tell me that they deleted something or that the machine does not boot anymore, I can relax and tell them that we can restore all their data in a matter of minutes....you can't imagine how good that makes me feel.

  • It is not FOSS, but I did install Google Chrome as it was the easiest option for watching Netflix and listening to Spotify.

  • I would recommend installing some privacy plugins and such into the browser your parents use, to make them even safer.

  • If you have some software that does not have a good replacement, try to use Wine for it. That worked well with MS Office 2007. (Sorry LibreOffice, but you still can't compete with MS here). PlayOnLinux helped me a lot with the Wine setup

Edit 21.01.2018 – 10am: Thanks for all your feedback recommending WPS Office! We tried it a while ago and it was less stable than the Wine + MS Office setup, which is why we dumped it again. Another reader suggested OnlyOffice, which I will give a try, as I did not know it before

  • If possible activate the automatic update and installation of all security updates.

Conclusion

For me the switch made a lot of sense, as my parents are not heavy technical users of their system. Should yours be into Photoshop, video editing or gaming, I do not think the switch will be as easy, as Linux and its software still cannot compete well in these areas.

I would love to get your feedback on this blog post: Did you switch your parents to Linux and how did that work out? Do you have other insights that should be added to this post? Hit me up via meet@simon-frey.eu

Thanks for reading! Simon Frey

P.S. One reason why my parents' machine did not boot several times was a plugged-in USB stick that the BIOS tried to boot from. So do not forget to set the boot order to boot from the hard drive first ;)


Did you like this post?

Donate: Donate Button or Patreon

Feedback: HackerNews or Email


RSS Feed – This work is licensed under Creative Commons Attribution 4.0 International License

Old Tux Image Source (CC BY-SA 3.0): https://easylinuxtipsproject.blogspot.com/p/mint-xfce-old.html

Source:  https://wordpress.org/plugins/gdpr/

As we all know there is this new privacy monster in Europe: GDPR.

For all of us content creators/bloggers out there providing our services to Europeans, a hassle began: we had to adapt to the new regulation. Always with the fear in mind that if we fuck it up we will be faced with huge penalties.

So I went through my websites: remove the Facebook Like button, remove analytics, add a privacy statement, add cookie opt-in/out .... the list goes on.

As I have always been privacy aware, my websites were mostly not a super big problem. Just removing some scripts and it worked out fine.

But there were also some websites for which it would have been a huge hassle to get them GDPR compliant (a web app saving users' PDFs and IPs, for example). And for those projects it was not worth the work, so I shut them down.

Schrödingers Website

Source: https://en.wikipedia.org/wiki/Schr%C3%B6dinger%27s_ca

One project I ditched was datenschutzhelden.org (main GDPR problem: the comment function). You are now maybe wondering: why is he linking to a dead page?

Actually the page is not completely dead! I don't host it anymore and it only lives on in the Wayback Machine (WBM). All requests to the domain get forwarded to the WBM.

As I don't own the WBM and none of the content is served from my servers, I see myself as being compliant with the GDPR by handing the risk over to someone else. (Thanks to the Internet Archive <3)

By being offline and online at the same time we have the Schrödingers website paradox :P

But fun aside: I'm not a lawyer and have no clue if that really solves my problems with the GDPR, as I still own the domain. So see this article as an idea for yourself and not as a legally binding statement.

Important notice: datenschutzhelden.org was NOT closed because of the GDPR. We already closed it earlier (February 2018), mostly because of time issues. But after the end of the project, the page was still up and running as an archive on my server (with me still being fully responsible for it). In March I decided to move it into the WBM so I don't have it on my servers anymore (and hopefully keep legal issues away).

The first part told you why I decided to go down that path and what the end result looks like. In the second part you will get to know how you can do the same.


How to archive your website yourself

Because the WBM only saves the pages that are requested directly via their search, normally only the few main pages of your blog get saved. (Or you have quite some manual work to do, searching for all your subpages.)

That looks like a problem one could automate :D

As I didn't want to add all my subpages manually I created a tool for doing so:

Save to web.archive.org tool logo

The 'Save to web.archive.org'-Tool

By following a simple set of instructions (next section) you can automatically scrape your page for all subpages and add them to the WBM (with all requests being proxied to circumvent rate limits).
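To give you an idea of what happens under the hood, here is a heavily simplified sketch of the approach (this is not the actual tool code and it skips the proxying and internal-link handling): fetch a page, collect its links with goquery and ask the WBM to save each of them via its https://web.archive.org/save/ endpoint.

package main

import (
	"fmt"
	"net/http"
	"os"

	"github.com/PuerkitoBio/goquery"
)

func main() {
	site := os.Args[1]

	res, err := http.Get(site)
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()

	doc, err := goquery.NewDocumentFromReader(res.Body)
	if err != nil {
		panic(err)
	}

	// Collect all linked pages and ask the Wayback Machine to archive them
	doc.Find("a").Each(func(_ int, s *goquery.Selection) {
		href, ok := s.Attr("href")
		if !ok {
			return
		}
		resp, err := http.Get("https://web.archive.org/save/" + href)
		if err != nil {
			fmt.Println("could not save", href, ":", err)
			return
		}
		resp.Body.Close()
	})
}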

Using the 'Save to web.archive.org'-Tool

Installation

I assume you have already installed go. (Go installation manual)

Dependencies

Download the dependencies via go get

Execute the following two commands:

go get -u github.com/simonfrey/proxyfy
go get -u github.com/PuerkitoBio/goquery
Download tool

Just clone the git repo

git clone https://github.com/simonfrey/save_to_web.archive.org.git

Execution

Navigate into the directory of the git repo.

Run with go run:

Please replace http[s]://[yourwebsite.com] with the URL of the website you want to scrape and save.

go run main.go http[s]://[yourwebsite.com]

Additional commandline arguments:

-p for proxyfing the requests

-i for also crawling internal urls (e.g. /test/foo)

So if you want to use the tool crawling internal links as well and use a proxy for that, it would be the following command:

go run main.go -p -i http[s]://[yourwebsite.com] 

[NGINX] Redirecting your domain

It's as simple as adding a single magic line to your site config:

Please replace http[s]://[yourwebsite.com] with your own website

return 301 https://web.archive.org/web/http[s]://[yourwebsite.com]$request_uri;

Now all requests to your website get forwarded to the WBM. It even keeps your old links alive.
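In case you are unsure where that line goes: a minimal sketch of a complete server block could look like this (adapt the domain; a real config will usually also contain your SSL certificate settings).

server {
    listen 80;
    server_name yourwebsite.com;

    return 301 https://web.archive.org/web/http[s]://[yourwebsite.com]$request_uri;
}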


Hope I was able to help you a bit with this article. See your pages soon in the WBM :D


Did you like this post?

Donate: Donate Button or Patreon

Feedback: HackerNews or Email


This work is licensed under Creative Commons Attribution 4.0 International License