Solving the right problem
Wed Nov 6 10:49:22 CET 2019
Thank you Ulf,
I’m hesitant to label what I’m going for here with loaded terms such as SOA architecture or “micro-SOA”. I’ve found in the many years I’ve worked in telcos as a Systems Architect or Design Authority that we do better when we don’t overplay architectural choices as if they are goals in themselves or even reusable solutions. Yes, for trivial problems with trivial solutions, where the cost of a bespoke architecture would never be warranted, it makes sense to squeeze the solution into an architecture that is there or thereabouts and deal with the little things that get in the way as they come up. Been there, done that too.
When, by contrast, the project’s value proposition is such that the cost, TTM and other implications of a custom architecture are warranted by the benefits to viability, sustainability and profitability, then using a “best fit” ready-made architecture tends to lead not to minor issues to overcome, but to impedance-mismatch-level issues to contend with when one can least afford them: while scaling and rolling out.
I’m hoping to leverage my extremely narrow problem domain to arrive at the simplest possible design rules and principles which will achieve the time to market, performance and scalability I’m aiming for. If the result turns out to be applicable to more people’s problem domains than my own that would be cool, but I’m obliged to prioritise my own project above all else.
I’m sorry about the vagueness in my question. I will endeavor to clarify more as questions like yours come up. The project I’m working on aims to have massive impact from a simple design. At this juncture I should describe it as a giant global key-value store where the keys have many dimensions, including space and time, and where the payload, if actually stored in a key-value store, would usually be a fraction of the size of the composite key. Because of that, and because the data is intrinsically relational (i.e. everything is related to everything either directly or indirectly, but yes, Relational as in the 5th Normal Form too), a true key-value store is often sub-optimal.
The idea is to have the client request data based on a cursor or user/session-specific context. Each client would maintain a logical connection with its closest (network-wise) server, where it makes the requests. The request is load-balanced at that server address to a host with available capacity and knowledge of the context. At that host, the request gets interpreted/processed by combining a) the context in which the request was made, such as 3D viewpoint and zoom level, b) the request itself, i.e. what data is being requested, c) how and where the raw or derived data can best be obtained. This I’d hope to determine both recursively and without sequential code bottlenecks by splitting up the request and context matching across as many nodes as are required and available. The same then goes for how the request is fulfilled. Because the client includes data in the request about the user’s apparent momentum, the matching process is also capable of anticipating what data and/or results might be requested next. Up to 100% of available resources may be spent on such pre-fetching, but it may not hold back the current request, nor may it demand resources required for other priority requests. Simply put, every core in every data center and every provisioned network link should run at 100% all the time, doing the most opportunistic work possible to provide each user fair and maximal use of the resources.
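To make the pre-fetching rule concrete, here is a minimal sketch of the two ideas in that paragraph: extrapolating the next viewpoints from the user's momentum, and queueing that speculative work so it never gets ahead of a current request. It is written in Python purely as neutral pseudocode, not as a proposal for the implementation language. All names (`WorkQueue`, `anticipate`, `CURRENT`, `PREFETCH`) and the linear extrapolation are my own assumptions for illustration. The sketch is also non-preemptive: a pre-fetch job that has already started is not interrupted.

```python
import heapq
import itertools

CURRENT, PREFETCH = 0, 1  # hypothetical priority classes; lower is served first


class WorkQueue:
    """Priority queue in which speculative work only runs when nothing else waits."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: FIFO within a priority class

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._order), job))

    def next_job(self):
        """Return the most urgent job, or None when the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


def anticipate(viewpoint, momentum, steps=2):
    """Extrapolate the viewpoint linearly along the momentum vector (an assumption)."""
    return [
        tuple(p + m * k for p, m in zip(viewpoint, momentum))
        for k in range(1, steps + 1)
    ]


queue = WorkQueue()
# Speculative work is queued first ...
for vp in anticipate((0, 0, 0), (1, 0, 0)):
    queue.submit(PREFETCH, ("prefetch", vp))
# ... but a current request submitted later is still served before it.
queue.submit(CURRENT, ("fetch", (0, 0, 0)))

drained = [queue.next_job() for _ in range(3)]
```

Idle workers that keep pulling from such a queue would spend all spare capacity on pre-fetching, while any newly arriving current request goes to the front.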
To implement that simply and in a manner that can evolve naturally, I was hoping to find the ideal place in code terms for a function which takes a “parsed” request and a cursor data structure as parameters and returns a result structure which will be filled in in due course (not a completed result). The returned result would be handed back to an appointed client handler, which would compile and send results back to the client as and when they come in from wherever they had to be fetched. The client itself would be aware that not all data is present yet when it gets results, and would set itself up to be “subscribed” to data arriving later.
I know this is still very vague and I am sorry about that. But maybe there is enough in there to kick off the next level of discussion about what I asked in the first place, which is to learn about past and/or present experiences using Erlang to solve similar problems, and what worked out well vs what didn’t.
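As a rough illustration of the shape I have in mind, the sketch below returns a result handle immediately and lets a client handler forward parts as they arrive. It uses Python's asyncio only as neutral pseudocode; in Erlang the handle would more naturally be a process receiving messages. Every name here (`PendingResult`, `handle_request`, `client_handler`, the tile names, the `zoom` field) is a hypothetical stand-in, and the sleeps merely simulate fetches from elsewhere.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class PendingResult:
    """Handle returned immediately; its parts are filled in later."""
    expected: int                                    # number of parts to come
    parts: asyncio.Queue = field(default_factory=asyncio.Queue)
    tasks: list = field(default_factory=list)        # keeps fetch tasks alive


async def fetch_part(result, name, delay):
    await asyncio.sleep(delay)                       # stand-in for a remote fetch
    await result.parts.put((name, f"data for {name}"))


def handle_request(request, cursor):
    """Start one fetch per requested item and return without waiting."""
    result = PendingResult(expected=len(request))
    for i, item in enumerate(request):
        name = f"{item}@zoom{cursor['zoom']}"
        delay = 0.01 * (len(request) - i)            # later items arrive first here
        result.tasks.append(asyncio.create_task(fetch_part(result, name, delay)))
    return result


async def client_handler(result, send):
    """Forward each part to the client as soon as it arrives."""
    for _ in range(result.expected):
        name, payload = await result.parts.get()
        send(name, payload)


async def demo():
    sent = []
    result = handle_request(["tile_a", "tile_b", "tile_c"], {"zoom": 3})
    assert result.parts.empty()                      # nothing has arrived yet
    await client_handler(result, lambda name, payload: sent.append(name))
    return sent
```

Running `asyncio.run(demo())` returns the part names in arrival order, which need not match the order in which they were requested. That is the behaviour the client would have to expect when it "subscribes" to late data.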
From: Ulf Wiger <ulf@REDACTED>
Date: Tuesday, 5 November, 2019 at 17:03
To: "marthin@REDACTED" <marthin@REDACTED>
Cc: erlang-questions <erlang-questions@REDACTED>
Subject: Re: Solving the right problem
Re. 3, you should definitely look into using existing solutions for HTTP/HTTPS load-balancing. This will work every bit as well with Erlang/inets as with any other technology.
Re. 4, yes, and you're not limited to inets. Take a look at Cowboy, for example.
Re. 5, well, your description is vague enough that it's hard to answer, but you seem to be aiming for some form of SOA architecture. If you want to proceed quickly with prototyping and MVPs, you could implement a component architecture inside a single Erlang node and make some minimal preparations for being able to later break them apart into a larger network of services. A single Erlang node running on a decent cloud instance is likely to handle a fairly large number of devices without breaking a sweat, unless your applications are expected to be very computationally heavy.
This way, you can defer the many messy issues of going full SOA from the beginning, and benefit from Erlang's outstanding "micro-SOA" capabilities.
On Mon, 4 Nov 2019 at 13:10, Marthin Laubscher <marthin@REDACTED> wrote:
Please pardon my long absences from this awesome (mature) community and my limited knowledge which is likely outdated as well. I’ve known since 1996 when I was first told (in confidence by an Ericsson Radio Systems liaison) about Erlang that it would have to play a role when I eventually get to implementing the system I’ve been working on designing since 1991. That big day is drawing near, so now I’d like to reaffirm my high level understanding of what the language is suited for.
I reckon the problem I’m looking to address is intrinsically concurrent, and if I can design the solution just right I just might be able to avoid introducing sequential code and choke points, to create a dynamic (learning, responsive to conditions) distributed server capable of using all or most of its resources for any mixture of trivial, simple, complex and massive service requests, whether they come from a few clients or billions of them. Essentially as illustrated in the diagram below:
I’d like to ask your advice and check some assumptions if I may impose.
1. Is my conviction that Erlang (and OTP) is ideally if not best suited to addressing this type of problem objectively justified or likely based on loyalty and familiarity?
2. Is my aspiration to scale the same code from one server to potentially hundreds of thousands of cores spread across the globe hopelessly romantic or somewhere within the realm of possibility?
3. Assuming the network remains HTTP/HTTPS based, would Erlang’s inets module allow the code accepting new requests to interact with and control load balancing hardware to ensure each such request is served on the best available server, or will I need custom load balancers of my own for that?
4. Still assuming HTTP/HTTPS, will inets allow me to break up request processing across multiple processes and threads based on incremental scanning of the requests themselves?
5. Are there lessons from previous (or current) successes and/or failures to achieve similar results to learn from available in the public domain, like maybe from ejabberd or Yaws? (I’m not attempting to reinvent any wheels or address a general purpose need like Yaws et al. Internet and web protocols may be involved but I have a singular focus on delivering user-specific perspectives of a single large dataset to a custom client app.)
Given my two additional objectives of eventually:
a) ending up with a boringly simple system which elegantly encapsulates all these complex requirements, and
b) open-sourcing the entire system once it’s beyond reach of those with nefarious intentions,
would anybody like to get involved in helping design and implement this project or even take the lead on it?
Thank you in advance for your kind consideration.
[Non-text attachment scrubbed from the archive (64025 bytes, no description); presumably the diagram referred to above.]