#MultiThreading in #Python #Flask Web API

First of all, a real-world scenario, before we dig into a simplified code demo to show a generic solution. Let’s say that we want to implement a web site and connect it to a third party Web API (either weather or trading or what you prefer) and on top of that we want to keep track of the api results in an internal db for their versioning .

The idea

The high level solution is that you call the API online and you retrieve the expected result also from your previously saved data history in your app db. Then if they are different, you insert a new history record. So far, very cool idea, now imagine an optimization. The first two data retrieval operations – from the third party api and from your db – can go parallel with two IO threads. Finally you will do the comparison and eventually save if it is needed. To simplify the code, let’s just do a trivial int sum instead of the comparison in the real business logic and let’s make an effectful code that writes to a log instead of a real db.

Enters python Flask

From the user point of view you want to return the async Flask API to the browser ASAP. In the above scenario, you simply want to return the first third party API result as soon as it completes. Hence the conclusion is that you will write another thread to wait the DB operation of the second thread and finally compose their results (comparing the result of the third part API from the first thread with the result from your db history from the second thread), so that you can return from Flask API in the shortest possible time. Find below my module for this demo:

import threading
import time

class ThirdParty(threading.Thread):
     def __init__(self,low,high):
         super(ThirdParty, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         #call the third party API
         self.total = #retrieve the result

class ComparingThread(threading.Thread):
     def __init__(self,total1,thread2,logger):
         super(ComparingThread, self).__init__()
         self.total1=total1
         self.thread2=thread2
         self.result=0
         self.logger=logger

     def run(self):
         self.thread2.join()
         # At this point, both threads have completed
         self.result = self.total1 - self.thread2.total
         # compare the results and save the update to DB accordingly
         time.sleep(10)
         self.logger (self.result)

def test_threading(logger): #app.logger.info
    thread1 = ThirdParty(0,500000)
    thread2 = RetrieveFromDB(0,500000)
    thread1.start() # This actually causes the thread to run
    thread2.start()
    thread1.join()  # This waits until the thread has completed
    thread3 = ComparingThread(thread1.total,thread2,logger)
    thread3.start()
    logger('return flask result to users\'s browser thread 1: ' + str(thread1.total))
    return "client side done now!"

And here it is the Flask server part:

@app.route("/flask/api/threadtest")
def thread_test():
    return flaskthread.test_threading(app.logger.info)
    

Finally the client javascript:

function myFunction() {
  document.getElementById("demo_threading").innerHTML = "test multi-threading log: started...";
      var ipThreading = "/flask/api/threadtest";
      $.ajax({
          url: ipThreading
        })
        .done(function( data ) {
          console.log( "ipThreading data", data );
          var d = new Date();
          var n = d.toLocaleTimeString();
          document.getElementById("demo_threading").innerHTML = n + " - test multi-threading log: " + data;
        })
        .fail(function(e) {
          document.getElementById("demo_threading").innerHTML = "test multi-threading log: " + data;
        })
        .always(function() {
          var d = new Date();
          var n = d.toLocaleTimeString();
          console.log( n + " - ipThreading complete" );
        });
    };

F#: mixing W# React with W# UI

Ok, recently I’ve tried to port Jordan Marr’s Grouping from Fable to WebSharper.

First of all I understood that we could Inline the call to the React hook

[<Inline "React.useState($0)">]
let UseState (init: 'T) = X<'T * ('T -> unit)>

And Loïc Denuzière, the guru of WebSharper, helped my to define a React Functional Component

let FunctionComponent 
    (f: 'props -> React.Element) (props: 'props) 
    : React.Element =
    React.CreateElement
        (As<Func<obj, React.Element>> f, 
        box props)

Ok, well, now the nice thing is putting together the power of WS.UI with the availability of react components through W# React.
That implies that one can interoperate between W# UI and W# React. For example I can set a WS.UI var from a WS react component! And I am able to do a similar thing also the other way around: I mean setting a React state (let’s say created by the new useState) from WS.UI

The secret for the last thing is trasforming the setState returned from React in a functional F# mutable first class citizen function.

    let mutable setCount  = fun (i:int) ->  ()
    
[<JavaScript>]
module HelloWorld =
        
    let Example () =
        let count, setCount = React.UseState 0
        React.setCount <- setCount

Here you can find the magic snippet running online from Try W#.

Finally, notice that we can integrate W# React and W# UI also at DOM level in a single hole template.

div [
    on.afterRender ( fun el ->
        ReactHook.React.FunctionComponent 
            ReactHook.HelloWorld.Example ()
        |> WebSharper.React.React.Mount el
    )
] []

F#, WebSharper and Sql type Provider in action

I’m going to describe a real-world business requirement change in a web business app and how I can manage it with F# language and WebSharper platform.

I have an input and I want to make it multi-value

Well, I start from the code managing the DB, layer, where I have an Active Pattern based on the Selection type.

type ParsedTrade = NoSelection | IntSel of int

Where I will turn an int into an array of int:

type ParsedTrade = NoSelection | IntSel of int[]

Now the magic incantation starts with FSharp.Data.Sql Operators:

| IntSel i -> 
    query {
    for t in context.OilPhysical.EndurTradeValid do  
    join n_del in (!!) context.OilPhysical.EndurNominationValid on
             ((t.DealNumber,t.ParcelGroupId,t.ParcelId) 
             = (n_del.DeliveryDealNumber,n_del.DeliveryParcelGroup,n_del.DeliveryParcelId))
    join n_rec in (!!) context.OilPhysical.EndurNominationValid on
             ((t.DealNumber,t.ParcelGroupId,t.ParcelId) 
             = (n_rec.ReceiptDealNumber,n_rec.ReceiptParcelGroup,n_rec.ReceiptParcelId))
    join id_cost in (!!) context.OilPhysical.EndurProfitCenter on 
            (t.CostCenterId = id_cost.InternalId)
    join id_profit in (!!) context.OilPhysical.EndurProfitCenter on 
            (t.ProfitCenterId = id_profit.InternalId)
    where (t.BookingCompanyShortName = book 
        && ((isByTrade && t.DealNumber |=| i) || 
            (isByCargo && n_del.CargoId |=| i) || 
            (isByCargo && n_rec.CargoId |=| i) || 
            (isByCpty && t.ExternalLegalEntityId = (string i)) ))
    take 90
    select (t,n_del.CargoId, n_rec.CargoId, id_cost.EndurId, id_profit.EndurId)  
    }

Cool: the = becomes a |=| ! The Visual Studio 2019 IDE is guiding me and the language features are elegant, succinct and helpful.

End of the DB layer, now we are in WebSharper, server code, the remote API, and we have to adapt the Active Pattern for the array instead of a single int. How can I modify the following lines?

let (|Valid|_|) (str:string) =
    if String.IsNullOrWhiteSpace str then 
        Some(SqlDB.NoSelection) 
    else
        match Int32.TryParse str with
        | false, _ -> None
        | true, num ->
            if (num > 0) then Some(SqlDB.IntSel num) else None

My idea is to take advantage of the full functional programming power of F# here on the server side. So, I define my parser

let Int32TryParseArray (str:string) : int[] option =
    let parsed =
        str.Split(',')
        |> Array.map Int32.TryParse
    if parsed |> Array.exists(fun (ok,_) -> not ok) then None else
    parsed |> Array.map snd |> Some

and I use it in just one line!

let (|Valid|_|) (str:string) =
    if String.IsNullOrWhiteSpace str then 
        Some(SqlDB.NoSelection) 
    else
        Int32TryParseArray str |> Option.map SqlDB.IntSel

And it looks like the Server part is done! Let’s go for the Front-End, but thanks to the isomorphic WebSharper, we always see our friend F# also there! The technical requirement there is more complicated: I have to capture an other int from a string message in case of a certain alert type and pass it to the API in a comma separated format. Thanks to WebSharper I don’t have to deal with any JavaScript quirks.

The starting point, the code as it is before the change, is already very advanced, with F#  Anonymous Record Types needed to feed a jquery jtable… See the line that I will change

"AlertKey" => 
    {|
        title = "Key";
        list = true;
        sorting = true;
        edit = true;
        create = true;
        display = fun data -> 
            let param1 = {param_name = ServerModel.SelAlertKey; param_id = data?record?AlertKey}
            let param2 = {param_name = ServerModel.SelAlertType; param_id = alert2entity data?record?AlertCode}
            let post_form = {key_param_id = data?record?AlertKey; post_params = [| param1; param2|]}
            post_form  |==> EndPoint.Table 
            |> ToHtml
    |};

Into the new logic required for the multi-key enhancement:

let code : string = data?record?AlertCode
let msg : string = data?record?Message
let isMultiKey = code.StartsWith("A32") || code.StartsWith("A91")
let param1 = {param_name = ServerModel.SelAlertKey; param_id = 
    (if isMultiKey 
    then 
        data?record?AlertKey 
        + "," + 
        (msg.Split(' ') |> Array.last) 
    else data?record?AlertKey)
    }

Finally, I was showing a PL when the selection was a cargo number, therefore now I’ve to specify a when clause inside such a pattern matching

match selection with
| IntSel cargoIds when cargoIds.Length = 1 ->
    // .... showing the cargo PL
| _ -> 
    // ... as before

And you know… it will work at first compile!

Is Partition By more performant than Group By?

In the past I’ve tried to search over internet and read some tutorials and the message I’m getting (also from github comments on sql related project repositories, like this from linq2db) is that new, modern window functions are more efficient than old, traditional grouping.

WindowFunctions

By the way, regarding linq2db and similar patterns (sqlalchemy in python and so on): notice how it enables you to use the new powerful features and so abstraction doesn’t hide complexity, when digging into details is your job.

Traditional, plain old way:grouping

For example this could be the first part of the simplified query.

SELECT distinct
csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num,
parcel.price as parcel_price
FROM comm_schedule_header csh
INNER JOIN ab_tran ab ON  ab.ins_num=csh.ins_num
inner join (select deal_tracking_num, max(ins_num) as ins_num
from ab_tran group by  deal_tracking_num) gab
on  gab.ins_num=csh.ins_num

The effort to avoid the distinct and to optimize the query is not worth it (not even sure it would be much better than the original, eventually).

In the past I’ve tried to search over internet and read some tutorials and the message I’m getting (also from github comments on sql related project repositories, like this from linq2db) is that new, modern window functions are more efficient than old, traditional grouping.

New, modern and more performant approach.

The modernized, completed and reduced SQL, after deleting lines not relevant to the topic, is:

select cargo_id ,delivery_id , parcel_id , deal_tracking_num ,
parcel_price
from
(SELECT csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num,
parcel.price as parcel_price,
row_number() over(
    partition by ab.deal_tracking_num, csdc.cargo_id, csh.delivery_id, csh.parcel_id
    order by ab.ins_num desc) as ab_rn
FROM comm_schedule_header csh
INNER JOIN ab_tran ab ON  ab.ins_num=csh.ins_num
INNER JOIN comm_sched_delivery_cmotion csdc ON csh.delivery_id=csdc.delivery_id
left join (
    select parcel_id, ins.price, row_number()
           over(partition by fee.fee_id, fee.parcel_id order by fee.ins_num desc) as RN
    from INS_FEE_PARAM fee
    inner join ins_price ins on ins.ins_num = fee.ins_num
    and ins.param_seq_num=fee.param_seq_num and ins.pricing_source_id=fee.fee_seq_num
    and ins.pricing_vol>0 and ins.price>0
    where fee.fee_def_id=2
    ) parcel on parcel.RN=1 and parcel.parcel_id=csh.parcel_id
WHERE csh.bav_flag=1
AND csh.volume_type = 8
AND csh.last_update>=DATE '2020-03-01' -- example of parameter selection
AND csh.delivery_id0
ORDER BY  csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num) dash
where dash.ab_rn = 1;

When I time it, I see, for example, 3731 results in about less 10 seconds from Oracle SQL Developer version 18.1 as a client, for a certain date parameter selection (as written above, DATE '2020-03-01'), but I consider this timing not reliable, also because I have to click CTRL + END to fetch all rows (among other reasons, e.g the variance of statistics, possible caching effects, etc…).

My DB is Oracle Database 12c Enterprise Edition – 64 bit, but of course this topic applies much more broadly.

Scientific method, query plan comparison

A more scientific approach, as suggested by DBA experts, is comparing the query plans. Given the permissions I have, I compared the explain plans of the two queries and I roughly looked at the total cost of the select statement. The cost of the window function version, 12793, as per the html rendered here, is the minimum cost of the compared explain plans, so again this confirms that the sql in the question is, up till now, the best one, so the window functions appear more performant than just grouping.

This is the explain plan, exported as html.

C# vs Python vs F#… and the science of R0 in epidemiology

Recently I’ve studied many tutorials and video courses about Python. I also did an amount of practical exercises with real world projects in Python, on my Linux Fedora notebook and even in CPython on Android.

Here, in this post, the focus is on Pandas, since in this moment we are all concerned with stats, extractions from either Excel files or any big data (hype). By the way I will add here a couple of mentions of scientific papers, that mathematically and seriously explain the context and the limits of R(0), R(t), the reproduction number.

Now, look, in terms of quality of Excel data extraction, nothing is better than C# or Python openpyxl. For example you can easily and accurately get a DateTime cast from multiple ClosedXml cells composing together their values. Or in Python openpyxl

>>> worksheet = workbook.worksheets[0]
>>> type(worksheet)
<class 'openpyxl.worksheet.worksheet.Worksheet'>
>>> worksheet.cell(2,24).value
datetime.datetime(2020, 4, 1, 0, 0)
>>> worksheet.cell(2,25).value
datetime.time(12, 12, 3)

Screenshot from 2020-04-13 21-32-04

Python Pandas makes sense only for big data with standard, well defined and simple input format. F# for such standard input format as well and for pure, abstract, logical aggregation, especially inside .NET web platforms. C# or Python openpyxl are the most powerful for any disparate format, typical of business use cases. Anyway, notice that the performance might be an issue for the latter

Memory use is fairly high in comparison with other libraries and applications and is approximately 50 times the original file size, e.g. 2.5 GB for a 50 MB Excel file.

So C# appears to be the best, eventually

Finally, a note about two interesting, worth reading, scientific papers:

Notes On R0, James Holland Jones, Department of Anthropological Sciences, Stanford University, from which I mention:

his result has a nice geometric interpretation. The ESS virulence occurs where a line rooted at the origin is tangent to the curve that relates β to δ. This result is known as the Marginal Value Theorem and has applications in economics and ecology as well as epidemiology.

Theory versus Data: How to Calculate R0?
Romulus Breban, Raffaele Vardavas, Sally Blower – from this paper another very important and instructive lessons for our time:

Another approach (which is more commonly used) is to obtain R0 from population-level data, namely cumulative incidence data… It is very important to note that the individual-level modeling assumptions cannot be verified using population-level data (i.e., they remain hypothetical). ODE models are formulated in terms of disease transmissibility and progression rates at the population level. These parameters are obtained by fitting the model to population-level data; their relation to the individual-level processes may be quite complex and is generally unknown… Therefore, population-level predictions based upon an ODE model that use the R0 value found by contact tracing as a threshold parameter may be inaccurate.

Our novel results have significant implications for understanding the dynamics of outbreaks of infectious diseases, particularly for the biological understanding of the transmission dynamics of the pathogen, estimating the severity of outbreaks, making health policy decisions, and designing epidemic control strategies.

 

 

 

#Angular #Reactive Forms with @MaterialUI

This is extremely modern and cool.

Reactive

The demo app is published on firebase.

Look at the final code: it is a couple of angular data binding from a reactive form control to an interpolation through a function. This is so similar to the Model-View-ViewModel in desktop apps. My favorite F# stack solution, WebSharper, has, of course, reactive views and vars (I don’t like Elmish approach instead).

Here the point is that I like Angular – instead of React – Material UI. So I’m forced to use TypeScript. See this thread for clarifications.

By the way, moment.js is the de facto standard for dates on the frontend.

I’ve shared the whole project here on stackblitz!

Mutable, Shadowing, Recursion

After some days of analysis and complex examples of programming, based on adventofcode 2019, I have derived some useful theorems about functional programming.

There are 3 concepts we need to understand: mutability, shadowing and recursion.

80f7a-03oqjrnpedarirdsu

F# has different forms of variables: immutable and mutable (plus byref like C# and ref like OCaml).

First theorem of shadowing

You can always represent the changes of an immutable functional-style variable with shadowing.

Second theorem of mutability

Aside from while loop (and record type fields, if you need a single instruction for multiple assignment), you can always represent the changes of a mutable imperative-style variable with shadowing.

Recursion

You have 2 options to transform an imperative-style while-loop with mutable variables to a functional-style code

  1. tail-recursion if it is doable, but this is not always possible in F# (as opposite to haskell) without reintroducing  stack explosion in some edge cases, because F# is a strict , eagerly evaluated FP while haskell is non-strict, lazy
  2. you can always Seq/List/Array .unfold and in this way you can represent the imperative loop as a functional state machine (see most of AoC2019 solutions on github for practical, albeit complex examples)

 

 

F# type providers are awesome

For example, consider the csv format, that is quite common in business interfaces.
csv-icon

Well, F# provider has all the options you may need to load it.


type PL = 
    CsvProvider< 
        Sample = sample,
        Separators = ";", SkipRows = 1, HasHeaders = false, 
        Schema ="CargoID(string),CurrencyType(string),Currency(string),Amount(decimal)">

Now, we borrow a function from Reed Copsey

let tryToInt s = 
    match System.Int32.TryParse s with
    | true, v -> Some v
    | false, _ -> None

and then we can invent a cool, original, functional way to choose only the lines with a parsable int in the field for cargo id. We are indeed unleashing the power of F# Type Providers when we pass the safe-typed row in the signature of the following.

let withCargoID (rows:PL.Row[]) =
    rows
    |> Array.choose(fun r -> 
        tryToInt r.CargoID
        |> Option.bind (fun _ -> Some r)
    )

and the rest is an easy exercise of grouping

type CurrencyGroup = {CurrencyType: string; Currency: string}

let parse (filename:string) = 
    let pl = PL.Load filename
    let rows = pl.Rows |> Seq.toArray
    let header = rows |> Array.head
    printfn "rows #: %d with cargoes %d" rows.Length (withCargoID rows).Length
    printfn "first line => CargoID: %s, CurrencyType: %s, Currency: %s, Amount: %f!" header.CargoID header.CurrencyType header.Currency header.Amount
    let currGroups = 
        rows
        |> Array.groupBy( fun r ->
            {Currency = r.Currency; CurrencyType = r.CurrencyType}
        )
    currGroups
    |> Array.iter( fun (g, r) ->
        printfn "Currency %s CurrencyType %s #: %d with cargoes: %d" g.Currency g.CurrencyType r.Length (withCargoID r).Length
    )
    currGroups
    |> Array.sumBy(fun (g, r) -> r.Length)
    |> printfn "total records in all currencies %d" 
    currGroups
    |> Array.sumBy(fun (g, r) -> (withCargoID r).Length)
    |> printfn "total with cargoes in all currencies %d" 
    printfn "parsed fine!"

How to await an async callback

I’m trying to await a callback that is fired when a button is pressed.

maxresdefault

The important point is that I want to wait for the callback from a simple await without reshaping the code, that could be a loop of batches which require a user confirmation at each step.

We can also add a cancellation option if the user closes the dialog.

BatchProgressVM progressVM = new BatchProgressVM();
BatchProgress batchProgress = new BatchProgress();
progressVM.ProgressValue = 0;
progressVM.Log = msg => this.Status = msg;
progressVM.ProgressMessage = "Click Next to continue";
batchProgress.DataContext = progressVM;
batchProgress.Show();
CancellationTokenSource cts = new CancellationTokenSource();
batchProgress.Closed += OnClose;
Dispatcher dispatcher = Dispatcher.CurrentDispatcher;
void OnClose(object sender, EventArgs e)
{
    cts.Cancel();
}

In other words I want to achieve the following:

// e.g. it can be used in a loop
internal async Task BatchLogic(int num, int tot, CancellationTokenSource cts, Dispatcher dispatcher)
{
    taskNum = num;
    totTask = tot;
    dispatcher.Invoke(() =>
       {
           ProgressMessage = $"task {taskNum-1} of {totTask} done! Click Next!";
           IsBusy = false; // enable Next button for user to confirm going on
       });
    cts.Token.Register(() =>
    {
         if (callback != null) callback.Invoke(Task.FromResult(false));
    });
    // e.g. await that the user confirms
    await Task.Factory.FromAsync(beginMethod, endMethod, null);
    ProgressValue = 100 * taskNum / totTask;
}

with the following defintions

// define a closure for the callback
AsyncCallback callback; 
//pass the callback to the closure from the beginMethod
private IAsyncResult beginMethod(AsyncCallback callback, object state)
{
    this.callback = callback;
    return Task.FromResult(true);
}

private void endMethod(IAsyncResult obj)
{
    Task taskRes = obj as Task;
    // check if it is completed or cancelled
    bool res = taskRes?.Result ?? true;
    if (res)
    {
        ProgressMessage = $"task {taskNum} of {totTask} working! Pls wait";
        IsBusy = true; // disable Next button while working
    }
}

//and finally invoke it from the event (i.e. the Command method in the MVVM for WPF)
private async void RunNext()
{
    if (callback != null)
    {
        callback.Invoke(Task.FromResult(true));
    }
}

As I said, the usage is very convenient inside a loop

await progressVM.BatchLogic(batch_num, tot_batch, cts, dispatcher);
if (cts.IsCancellationRequested) break;

In the end (e.g. after the async loop) you can show the results and unsuscribe the Close event.

Status = (cts.IsCancellationRequested ? "Cancelled!" : "Done!");
batchProgress.Closed -= OnClose;
batchProgress.Close();

Autogenerating columns: SQLProvider and WebSharper.

A real world example

This is a short follow up to my previous Modern CSS3 article
We want to see how to autogenerate the table columns from the DB to the Web in a few lines with type checks all the way through.

The DB

I’ve recently discovered the cool bang-bang operator for the left join in SQLProvider.

let context = Sql.GetDataContext(ConfigurationManager.ConnectionStrings.["AppDB"].ConnectionString)

let tradesQuery = 
    query {
        for t in context.OilPhysical.EndurTradeValid do  
        join n_del in (!!) context.OilPhysical.EndurNominationValid on
                 ((t.DealNumber,t.ParcelGroupId,t.ParcelId) 
                 = (n_del.DeliveryDealNumber,n_del.DeliveryParcelGroup,n_del.DeliveryParcelId))
        join n_rec in (!!) context.OilPhysical.EndurNominationValid on
                 ((t.DealNumber,t.ParcelGroupId,t.ParcelId) 
                 = (n_rec.ReceiptDealNumber,n_rec.ReceiptParcelGroup,n_rec.ReceiptParcelId))
        take 90
        select (t,n_del.CargoId, n_rec.CargoId)  
        }

let trades = 
    tradesQuery |> Seq.toArray 
    |> Array.map(fun (t,n_del,n_rec) -> 
        t.ColumnValues
        |> Seq.append ([("Del Cargo ID", n_del :> obj)] |> Seq.append [("Rec Cargo ID", n_rec :> obj)]  ))

The Web

While for desktop app you would need property bindings and plain old classes, the modern type provider types are a nice fit for the awesome WebSharper dsl.
The conclusion is that you can easily configure an agile web page with all the fancy features you need.

        if Array.isEmpty ServerModel.trades then
            p [] [text "No trades selected"]
        else
            Doc.Concat [
            div [] [client <@ Client.Inspector() @>];
            table [attr.``class`` "table"; attr.``data-`` "role" "table"; attr.``data-`` "horizontal-scroll" "true"; attr.id "demo-table"] [
                
                yield thead [] [
                    tr [] [
                        for (k,v) in ServerModel.trades.[0] do
                            match v with
                            |  %3A? int
                            |  %3A? decimal -> yield th [attr.``class`` "sortable-column"; attr.``data-`` "format" "number"] [text k]
                            | _ ->  yield th [attr.``class`` "sortable-column"] [text k]
                    ] ]
                yield tbody [] [
                for tup in ServerModel.trades do
                    yield tr [] [
                        for (k,v) in tup do
                            yield td [] [text <| string v] ]
                    ]