Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Sean Parent has a good talk about no loops in C++. His primary argument is that typically a loop is an algorithm you’re applying (map, filter, reduce…) and the raw loop masks the algorithm.

https://m.youtube.com/watch?v=qH6sSOr-yk8



While I agree that a lot of loops could be better implemented as a map, filter or similar construct, there's still many loops where writing it as a loop makes it more clear what's going on.

For example, in our system we have orders. Each order has some order items as well as one or more invoice.

Due to reasons, some customers want to consolidate orders before processing them in our system. In that case all the items should simply be copied to the consolidated order, however for invoices we should accumulate values for the same invoice number and currency combo.

In addition, order items references the invoice they belong to, so we need to keep track of the new invoice id's so we can remap that reference.

Doing all this in a few nested loops makes the overall process very clear I think, each step in the loop being clear and logical. In that case, the loops highlight the algorithm I think.

I'm not sure how to implement the consolidation only in terms of map, filter and friends in a way which would be more clear.


> there's still many loops where writing it as a loop makes it more clear what's going on

I think that really depends on your language and what you are used to. For me, I would disagree.

> I'm not sure how to implement the consolidation only in terms of map, filter and friends in a way which would be more clear.

Tbh, when I read this, I'm thinking: is the data-structure wrong? Maybe you should have something like a "bundle" that contains orderItems and their invoice. It doesn't seem to make so much sense to keep an invoice ID in each orderItem. That makes sense in a relational db, but not when using objects.

I would model it like that (for lack of a better name than "bundle"):

    Order(bundles: Map[InvoiceId, Bundle(invoice: Invoice, orderItems: List[OrderItem]))
Now orderItems and the invoice-amount are logically bundled together and identified by the invoice-id. Then, combining two orders becomes really simple: combinedOrder = order1 + order2

No loops needed, simply treating the data as monoids.

Here's an executable example: https://scastie.scala-lang.org/DDIRbiC5TTuOxmkzEFxTqA

You might argue that it is not clear how that works and you are right for everyone who is not used to how monoids work. But everyone who is, independent of the language, will understand exactly what happens. On the other hand, someone who is not used to loops would probably argue that your loops are hard to understand.

It really boils down to what techniques one is familiar with.


I think I get your point, however the real world is muddy...

> I would model it like that

How does your model take into account orders where order lines do not map to an invoice, or which has invoices which no order lines references?

edit: interesting example btw, I do like a lot of aspects and it was fun seeing some "practical" Scala, haven't had a chance to dig in much. Much appreciated!


> How does your model take into account orders where order lines do not map to an invoice, or which has invoices which no order lines references?

If I understand you right, the latter seems easy: the list of orderItems would just be empty.

If there can be orderItems without invoice (why would that happen?) then I'd just add another field "plainOrderItems: List[OrderItem]" next to the bundles.

The rest of the code would not change.

That being said, there can surely be situations where modeling the data is not trivial. It's always exciting to model it in the best way to describe the problem at hand. And with this approach, sometimes you need to model different "views" to make combining things work.

> edit: interesting example btw, I do like a lot of aspects and it was fun seeing some "practical" Scala, haven't had a chance to dig in much. Much appreciated!

You are welcome :)


> If there can be orderItems without invoice (why would that happen?)

Because people mess up... In this case they'd then fix the missing invoice references post-consolidation.

Stuff like this usually happens either due to stress, lack of experience or training (summer temps are a menace) or just negligence.

In addition to our consolidation procedure, I've naturally had to write a procedure to undo the consolidation.

> then I'd just add another field "plainOrderItems: List[OrderItem]" next to the bundles.

That could work, would be more tedious when you want to operate over all items (we do this fairly often), but not terribly so.


Okay, interesting.

> That could work, would be more tedious when you want to operate over all items (we do this fairly often), but not terribly so.

Hm, not really. I've updated the code with a convenience method to see how it feels and I don't think it impacts the ergonomics at all:

https://scastie.scala-lang.org/QG1VYuFBQ4qRwfJSjq7bjA


Obviously I can’t see the business logic but I would express this (or what I believe I understand of it) as a fold that accumulates a map of structures which organize the invoice(s) alongside the orders somehow. I suspect that would result in a kind of “inversion” of responsibility that might be unfamiliar at first, but may well result in a clearer relationship between the data elements to someone new to the code (and may make refactoring a lot easier).


Can you post a (simplified) sketch of what your looping solution looks like? I'd like to give it a shot with map/filter/etc so we can compare.


It's very close to this, in C#-ish pseudocode, and barring errors introduced by my fried Friday brain. The duplicate invoice lookup isn't a linear search but I wanted to keep it simple. In reality order items have various sub-items, but lets ignore those.

It's not super elegant, but in my mind it's straight forward and should be easy for my colleagues to jump in and understand. Though I'd be delighted to be proved wrong.

    var dstOrder = new Order();
    var invoiceMap = new Dictionary<Invoice,Invoice>();
    
    for srcOrder in SrcOrders do
    {
      for srcInvoice in srcOrder.Invoices do
      {        
        // compares using invoice number and currency
        var dstInvoice = dstOrder.Invoices.Find(srcInvoice);
    
        if (dstInvoice == null) {
          dstInvoice = dstOrder.AppendInvoice();
          dstInvoice.CopyFrom(srcInvoice);
          invoiceMap[srcInvoice] = dstInvoice;
        }
        else 
        {
          dstInvoice.Amount += srcInvoice.Amount;
        }
      }
    
      for srcItem in srcOrder.Items do
      {
        var dstItem = dstOrder.AppendItem();
        dstItem.CopyFrom(srcItem);
        // map old invoice reference to new (possibly consolidated) invoice
        dstItem.Invoice = invoiceMap[dstItem.Invoice];
      }
    }
    
    dstOrder.Save();

    return dstOrder;


some of this is a little unclear. in particular how the source invoices are identified and mapped and therefore what invoiceMap really looks like. but here is a group/sum in a madeup datalog dialect:

reconcile(dstId, amount) :-

  Orders(srcOrder)

  Invoices(srcInvoice, srcOrder)

  unique(dstId, srcInvoice.id)

  amount = {
     sourceInvoice.id = dstId,
     sum (srcInvoice.Amount)
  }
that produces invoice/sum pairs. Orders and Invoices are your source relations. unique produces a set of unique source invoice ids, which drives the cardinality of the aggregate. the block notation is a scoping construct encapsulate the set cardinality changes.

not going to claim that this is inherently more readable or captures all the subtleties. but you can imagine that terse expressions of intent like this are more accessible to the reader and less error prone. if not, I should do a better job.


Sorry, I was in a bit of a hurry. An Invoice has essentially these fields:

    id
    InvoiceNo
    InvoiceDate
    Amount
    Currency
The id is the unique id in the DB. For the purposes of consolidation we consider (InvoiceNo, Currency) as a tuple. For brevity I didn't include the failure mode of mismatching invoice dates, that can be checked in advance anyway so not terribly relevant.

The invoiceMap maps instance to instance, or the unique id to unique id if you like.


I guess my answer depends on what kind of DB this really is.

If the app is running off a SQL Backend, you could probably do this with some well thought out queries, however whether doing so in an ORM/ActiveRecord Environment (which is what this looks like) will be beneficial might be another matter.

For instance, if this was SQL and a Micro-ORM was involved, I'd instead try to grab all the data in one pass (With the right ones that's fairly simple,) Calculate my Insert set from that, and write out the new records. In that case though, there'd probably still be some form of LINQ/Looping, both on the level of business code, as well as under the wire. It would be more performant for sure, but to your point, IDK whether it would be more understandable




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: