
maklipsa | user

1 215,08
1801 days, 7 hours, 29 minutes ago
May 6, 2016
dotnetomaniak.pl

Reading time: ~4 minutes. This post analyzes a very interesting optimization proposed by Nicholas Frechette in the comments under the previous post. He proposed using one of the oldest tricks in the performance cookbook: divide and conquer. Well, it did not turn out as I expected. Saga: Before I go further, here are some links to the previous posts on the problem of calculating similarities and then optimizing them. This thread grew to a few posts. Here are all of them: How I calculate similariti...

Dividing a bit in two for performance – IndexOutOfRange

Art of programming 2860 days, 8 hours, 53 minutes ago maklipsa 82

Reading time: ~2 minutes. This post was inspired by a discussion on Reddit that followed my previous post. In it, I will cover a suggestion by BelowAverageITGuy that cut the total execution time by almost one hour. Saga: Before I go further, here are some links to the previous posts on the problem of calculating similarities and then optimizing them. The thread grew to a few posts. Here are all of them: How I calculate similarities in cookit?; How to calculate 17 billion similarities; Independent code in ...

Making bits faster – IndexOutOfRange

Art of programming 2871 days, 17 hours, 9 minutes ago maklipsa 96

Reading time: ~1 minute. This will be a quick erratum to the previous post. This time I will expand on the oldest performance mantra: the fastest code is the code that doesn’t execute; second to that is the code that executes once. Last time I forgot to mention one very important optimization. It was one of two steps that allowed me to go from 1530 to 484 seconds in the sample run. Saga: Before I go further, here are some links to the previous posts on the problem of calculating similarities and then...

Independent code in performance optimizations – IndexOutOfRange

Distributed programming 2884 days, 18 hours, 38 minutes ago maklipsa 49

Reading time: ~6 minutes. Last time I showed how I went from 34 hours to 11. This time we go faster. To go faster I have to do less. The current implementation of Similarity iterates over one vector and checks whether each ingredient exists in the second one. Since those vectors are sparse, the chance of a miss is high. This means I am wasting computational power on iterating and calling TryGetValue. How can I iterate only over the mutually owned ones, and do it fast? Saga: Before I go furth...
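The excerpt above is cut off before it shows any code, so the following is only a minimal sketch of the idea it describes, written in Python rather than the author's C#. It assumes each recipe is a sparse mapping from ingredient id to weight; the names `dot_naive`, `mask_of`, and `dot_masked` are hypothetical. The first function mirrors the "iterate and look up" approach, and the second uses a bit mask so that only ingredients present in both vectors are visited.

```python
def dot_naive(a: dict, b: dict) -> float:
    """Walk every entry of `a` and look it up in `b`; most lookups miss."""
    total = 0.0
    for ingredient_id, weight in a.items():
        other = b.get(ingredient_id)  # rough analogue of TryGetValue
        if other is not None:
            total += weight * other
    return total


def mask_of(vector: dict) -> int:
    """Build an integer bit mask with bit i set when ingredient i is present."""
    mask = 0
    for ingredient_id in vector:
        mask |= 1 << ingredient_id
    return mask


def dot_masked(a: dict, mask_a: int, b: dict, mask_b: int) -> float:
    """Visit only ingredients present in both vectors, found via a bitwise AND."""
    common = mask_a & mask_b
    total = 0.0
    while common:
        lowest = common & -common                # isolate the lowest set bit
        ingredient_id = lowest.bit_length() - 1  # index of that bit
        total += a[ingredient_id] * b[ingredient_id]
        common ^= lowest                         # clear the bit just handled
    return total
```

In this sketch the masks would be computed once per recipe and reused for every pair, so the per-pair cost depends on the number of shared ingredients rather than on the length of either vector.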

Using bit masks for high-performance calculations – IndexOutOfRange

Art of programming 2881 days, 18 hours, 18 minutes ago maklipsa 59

Reading time: ~5 minutes. The previous post described the methodology I used to calculate similarities between recipes in cookit. If you haven’t read it, I’d give it 4 minutes, because it will make this post easier to understand. Go on, I’ll wait. It ended on a happy note, and everything seemed to be downhill from there. It was, until I tried to run it. It took long. Very long. How long? I don’t know, because I canceled it after about one hour. Going with a famous quote (probably from E...

[EN] How to calculate 17 billion similarities – IndexOutOfRange

Art of programming 2887 days, 18 hours, 13 minutes ago maklipsa 141

Reading time: ~5 minutes. Warning: this post contains some math. Even more, it shows how to use it to solve real-life problems. This post describes how I calculate similarity between recipes in my pet project, cookit.pl. For those of you who don’t know, cookit is a search engine for recipes. It crawls websites, extracts recipe texts, then parses them and tries to create a precise ingredient list with amounts and units. 182 184 recipes, 2936 ingredients. This scale may not seem huge, but tr...
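As a rough sanity check on that scale, here is a small Python sketch. It assumes every unordered pair of distinct recipes is compared exactly once; that is my assumption, since the excerpt is cut off before it says how pairs are counted. The helper name `unordered_pairs` is hypothetical.

```python
def unordered_pairs(n: int) -> int:
    """Number of distinct unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


recipes = 182_184  # recipe count quoted in the excerpt
pairs = unordered_pairs(recipes)
print(f"{pairs:,} recipe pairs")
```

Under that assumption the count comes to roughly 16.6 billion pairs, which is in line with the "17 billion similarities" mentioned in the title of the follow-up post.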

How I calculate similarities in cookit? – IndexOutOfRange

Art of programming 2895 days, 17 hours, 5 minutes ago maklipsa 56

Reading time: ~6 minutes. This post covers a subset of what I talk about in my talk How I stopped worrying and learned to love parallel processing (currently only in Polish). It covers how, in terms of performance, AsParallel can kick you where it hurts a lot, while simultaneously being a blessing in terms of… performance. How is that? Let’s look at some history: AsParallel was introduced as an extension to LINQ with TPL in .NET 4.0. In theory, it’s a godsend. The promise w...

[EN] Problems with AsParallel – IndexOutOfRange

Architecture 2897 days, 15 hours, 11 minutes ago maklipsa 117

Reading time: ~4 minutes. Diagnosing high memory usage can be tricky; here is the second part of how I found what was hogging too much memory in our system. In the previous post I wrote about how to create a memory dump and how many options ProcDump offers for catching just the right moment for it. When trying to analyze memory leaks or high memory usage (not necessarily meaning a leak), we have a few ways to approach it. Attach a debugger: there are many problems with this approach, to name a fe...

[EN] Debugging high memory usage. Part 2 - .NET Memory Profiler – IndexOutOfRange

Tools 2950 days, 9 hours, 34 minutes ago maklipsa 90

Reading time: ~2 minutes. I’m taking a short break from the Hangfire series, but I will get back to it. This time: where did my memory go? Or, to be more exact: why is this using so much memory? The story starts with one IIS application pool using around 6 gigabytes of memory on one of our test environments. That was several times more than we expected it to use, so we decided to investigate. Without much thinking, we fired up Visual Studio installed on the test server and attached to the proce...

[EN] Debugging high memory usage. Part 1 - ProcDump – IndexOutOfRange

Architecture 2956 days, 18 hours, 50 minutes ago maklipsa 64

Reading time: ~6 minutes. This is the sixth part of a series: part 1 - Why schedule and procrastinate jobs?; part 2 - Overview of Hangfire; part 3 - Scheduling and Queuing jobs in Hangfire; part 4 - Dashboard, retries and job cancellation; part 5 - Job continuation with ContinueWith; part 6 - Recurring jobs and cron expressions. Parts 3, 4, and 5 covered the BackgroundJob class, responsible for enqueuing single jobs (fire and forget). This post will cover the RecurringJob class, which exposes the API for recurring jobs (as the name ...

[EN] Don't do it now! Part 6. Hangfire details - recurring jobs and cron expressions – IndexOutOfRange

Architecture 2960 days, 12 hours, 11 minutes ago maklipsa 45

Reading time: ~3 minutes. This is the fifth part of a series: part 1 - Why schedule and procrastinate jobs?; part 2 - Overview of Hangfire; part 3 - Scheduling and Queuing jobs in Hangfire; part 4 - Dashboard, retries and job cancellation; part 5 - Job continuation with ContinueWith; part 6 - Recurring jobs and cron expressions. Part 3 covered almost all functions in the BackgroundJob class except for the ContinueWith family of functions. So here we go :) The fact that it has the same name as a System.Threading.Tasks.Task funct...

[EN] Don't do it now! Part 5. Hangfire details - job continuation with ContinueWith – IndexOutOfRange

Architecture 2967 days, 15 hours, 24 minutes ago maklipsa 77

Reading time: ~3 minutes. This is the fourth part of a series discussing job scheduling and Hangfire details: part 1 - Why schedule and procrastinate jobs?; part 2 - Overview of Hangfire; part 3 - Scheduling and Queuing jobs in Hangfire; part 4 - Dashboard, retries and job cancellation. This part will cover a few small topics: dashboard; retries; the more technical part of the Hangfire.BackgroundJob class API; job cancellation. Dashboard: Let’s start with the administrative dashboard, because it gives a good background for the ...

[EN] Don't do it now! Part 4. Hangfire details - dashboard, retries and job cancellation – IndexOutOfRange

Architecture 2974 days, 20 hours, 37 minutes ago maklipsa 55

Reading time: ~2 minutes. This is the third part of a series discussing job scheduling and Hangfire details: part 1 - Why schedule and procrastinate jobs?; part 2 - Overview of Hangfire; part 3 - Scheduling and Queuing jobs in Hangfire; part 4 - Dashboard, retries and job cancellation. This part will focus on the basic scheduling API of Hangfire. The easiest way to create a fire-and-forget job is by using the class Hangfire.BackgroundJob and its minimalistic (and this is a compliment) API of static functions: Enqu...

[EN] Don't do it now! Part 3. Hangfire details - jobs | Asynchronous background jobs with Hangfire – IndexOutOfRange

Architecture 2992 days, 7 hours, 33 minutes ago maklipsa 99

Reading time: ~2 minutes. In the previous post I wrote about why I think the ability to schedule tasks for later execution is a fundamental technical feature, but also a must-have from a business point of view. We are past the whys, so let’s get to the hows. The answer is simple: Hangfire. I wrote about it here, here and here, so yeah, I like it. Hangfire is an amazing library. It has proved itself in my pet project (cookit.pl) and in a huge ERP system that we are building at work, where we repla...

Don't do it now! Part 2. Background tasks, job queuing and scheduling with Hangfire – IndexOutOfRange

Architecture 3003 days, 18 hours, 53 minutes ago maklipsa 153

Just how long does garbage collection take in .NET? Which generation takes longer?

[EN] The cost of garbage collection – IndexOutOfRange

Art of programming 3015 days, 14 hours, 33 minutes ago maklipsa 124

One of the steps in cookit is calculating similar recipes. This is what you can see on the left of a recipe page like this. For the sake of clarity and manageability, it’s scheduled as separate Hangfire jobs. Because cookit is running 5 workers, similarities are calculated for 5 websites concurrently. The process uses cosine similarity, so it allocates a huge list at the start and calculates similarities, a very CPU-heavy operation. So, some time after triggering a recalculation of all recipes, I saw this in...
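For readers unfamiliar with the measure named above, here is a minimal Python sketch of cosine similarity over sparse vectors. It is not the author's C# implementation, which the excerpt does not show. The representation (a mapping from ingredient id to weight) and the function name `cosine_similarity` are assumptions made for illustration.

```python
import math


def cosine_similarity(a: dict, b: dict) -> float:
    """Cosine of the angle between two sparse vectors stored as {id: weight}."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector to reduce lookups
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)
```

Two recipes with the same ingredients in the same proportions score 1.0, and recipes with no ingredient in common score 0.0.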

GC can kill You. Practical GC performance counters in .NET – IndexOutOfRange

Other 3075 days, 13 hours, 14 minutes ago maklipsa 99

Anyone who has done any HackerRank problems concerning performance has seen this phrase in the assignment: “watch out for slow IO”. We are used to thinking about files, databases and such as potentially slow IO, but the Console? Yes, and you will be amazed how much. A couple of words about the setup. I am using NLog with a file target (for normal logging) and a mail target (for total failure and aggregated reports). When debugging or profiling I run the process as a con...

Tags: net, performance
System.Console is slow – IndexOutOfRange

Other 3084 days, 13 hours, 16 minutes ago maklipsa 155

One of the main processes in cookit deals with extracting recipe information from raw HTML. I know it isn’t the most elegant solution, but it is the only universal one. But to the point. Every web page goes through a process involving HTML parsing, stemming, parsing, and n-gram token matching. Then it’s saved to SQL Server and, after transformation, to Solr. So there is a lot of string manipulation, math calculations and, from time to time, mostly 0-gen GC. In the most pessimistic case this process has to be r...

[EN] Local optimizations don't add up – IndexOutOfRange

Art of programming 3085 days, 13 hours, 30 minutes ago maklipsa 60

In the previous post I wrote about new features in Neo4j. One of the new game-changing features was stored procedures. But, as I experienced, getting them to run in a Windows / .NET environment wasn’t that easy, and I was seeing “There is no procedure with the name …” more often than I wished for. So here is a short how-to. I hope it saves you some googling.

[EN] Neo4j stored procedures for Windows – IndexOutOfRange

Databases and XML 3102 days, 17 hours, 42 minutes ago maklipsa 25

Last week I had the opportunity to attend Graph Connect Europe. There were many great sessions, but one thing topped them all: Neo4j 3.0 is out! As with the previous major release (which introduced Cypher), there are many bug fixes, tweaks, and speed improvements, but here are my personal favorites.

[EN] Graph Connect Europe 2016 – IndexOutOfRange

Databases and XML 3106 days, 12 hours, 28 minutes ago maklipsa 28
