Your neural network is probably not a tensor

  • Published: 12 Aug 2021
  • The word tensor suggests a connection to the powerful tools of linear algebra and other mathematics. Yet definitions of tensors seem, on the whole, rather basic. How can something basic come with such powerful tools? The answer lies in clarifying what is, and is not, a tensor.
  • Science

Comments • 23

  • @BlackRose4MyDeath
    @BlackRose4MyDeath 2 years ago +3

    Great video. This definitely points out the subtlety in how linear algebra can and cannot be used as a tool in data science and/or machine learning.

  • @adamburry
    @adamburry 2 years ago +7

    I enjoyed this video, very interesting and thoughtful.
    A similar example that came to mind as I was watching is ID numbers. Phone numbers, credit card numbers, policy numbers, etc. are not really numbers; they are names with a digit alphabet. You cannot meaningfully add two credit card numbers together, for example. But ID numbers are modeled in software as integers rather than arrays of characters (or digits) all the time, for ease and efficiency of storage.
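    The point about ID "numbers" can be sketched in Python (the phone numbers below are made up):

```python
# Two made-up phone "numbers". Stored as integers, nothing stops
# meaningless arithmetic on them:
alice = 5551234567
bob = 5559876543
meaningless_sum = alice + bob  # a "sum" that identifies nobody

# Treated as names over a digit alphabet, only name-like operations
# (validation, prefix checks) are meaningful:
alice_id = "5551234567"
assert alice_id.startswith("555")  # area-code check: meaningful
assert len(alice_id) == 10         # format check: meaningful
```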

    • @gerben880
      @gerben880 1 year ago +1

      Interesting. I never thought about it like that, but what you're saying makes complete sense!

  • @quadmasterXLII
    @quadmasterXLII 2 years ago +12

    While the multidimensional arrays that I operate on in machine learning tasks don't transform like tensors, I do end up using Einstein notation all the time to tell the computer how to multiply / combine / reduce them. I suspect this extremely useful notation is why the name stuck.
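    For instance, with NumPy's einsum (the shapes here are arbitrary):

```python
import numpy as np

# Einstein-style index notation tells the computer how to multiply,
# combine, and reduce multidimensional arrays in one short string.
A = np.arange(12.0).reshape(3, 4)
B = np.arange(20.0).reshape(4, 5)

# "ij,jk->ik": sum over the shared index j -- an ordinary matrix product.
C = np.einsum("ij,jk->ik", A, B)
assert np.allclose(C, A @ B)

# "bi,bi->b": a dot product for each row of a batch indexed by b.
x = np.arange(8.0).reshape(2, 4)
y = np.ones((2, 4))
dots = np.einsum("bi,bi->b", x, y)
assert np.allclose(dots, (x * y).sum(axis=1))
```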

    • @a_name_a
      @a_name_a 2 years ago +4

      Nah, it’s cuz it sounded cool to computer scientists

  • @AdobadoFantastico
    @AdobadoFantastico 2 years ago +3

    Thanks, these are some very interesting points to consider.

  • @garyantonyo
    @garyantonyo 1 year ago

    love the last slide

  • @AamirSiddiquiCR7
    @AamirSiddiquiCR7 2 years ago +1

    Please continue making more videos like this

  • @judgeomega
    @judgeomega 2 years ago

    razor sharp.

  • @platinumpig
    @platinumpig 2 years ago

    Can one use an adjacency matrix of a Wheatstone bridge circuit to calculate the resistance between any two points in that circuit?

  • @ruroruro
    @ruroruro 2 years ago +16

    Nobody can own a word. The word "tensor" used to mean something specific in some branches of maths and physics.
    As time passed, this word got adopted into computer science and machine learning, and in that context it slightly changed its meaning. ML Tensors, Physics Tensors and Math Tensors still have a lot of similarities, but they are in fact different objects.
    In machine learning libraries, a tensor is just an N-dimensional array of numbers. That's it.
    You say that an image is not a tensor "in the real sense", but who gets to define what "real sense" is?
    This situation is actually quite common. Mathematicians, Physicists and Programmers alike are kind of bad at naming things. We tend to reuse and mutate our definitions, and this leads to confusing situations where the definitions get mixed up.
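    That deflationary definition is easy to see in code; a sketch with NumPy (the shapes are illustrative):

```python
import numpy as np

# In an ML library, a "tensor" is a shape, a dtype, and a block of
# numbers -- no transformation law comes attached.
image = np.zeros((224, 224, 3), dtype=np.uint8)      # height x width x RGB
batch = np.zeros((32, 224, 224, 3), dtype=np.float32)  # a batch of images

print(image.ndim, batch.ndim)  # 3 4
```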

    • @algeboy
      @algeboy  2 years ago +11

      Thx, 100% agree. As I say at the end, if context suggests "tensor" means XYZ, go with it, and words like "formal tensor" can be used when one needs to be precise about the math. But it is still important to be aware that when you search the web you are bound to find "tensor" primarily as a math object (e.g. Wikipedia); it has a 150-year head start on its use elsewhere. The same is true of "vector", which has a similarly long-standing math meaning but which in CS is increasingly just a data type for random-access lists.
      A little history here: Sir William Hamilton invented both "vector" and "tensor". His "tensor" just meant the length (norm) of a quaternion, very different from the hypermatrix/multi-array meaning of many today. So it was bound to happen that the names would continue to mutate.

    • @alexisandersen1392
      @alexisandersen1392 2 years ago +2

      They're only different in so far as they're applied. If it's still an object subject to tensor algebra, it's a tensor... or at least you can describe it with a tensor, and its behavior can be modeled with a tensor algebra.
      At the end of the day all mathematical objects are essentially imaginary things, completely arbitrary mental models that we use to think more clearly about the subject matter that we find their application suitable.

  • @alexisandersen1392
    @alexisandersen1392 2 years ago +9

    A matrix IS a tensor.... or rather, "tensor" doesn't necessarily mean the higher-valence types of tensors you are likely referring to. A tensor is a generalization that encompasses scalars, vectors, matrices, dual vectors, multilinear maps, multidimensional arrays, and so forth. They're all tensors not because of their structure, but because they are objects subject to tensor algebras. Tensor algebra is a generalization of other algebras, including linear algebra, so your qualms are a semantic argument from ignorance.
    A neural network layer can be thought of as a tensor, provided that its specific implementation reproduces a tensor operation.... neural networks are generalizations that can include things which are and are not tensors. Generally speaking, when a neural network library uses a TPU, the TPU is using tensors, and in the interpretation of the output a decision is made to approximate a discontinuous function; but in reality the output of a neural network is rarely so discrete, and the raw outputs are values in a continuous range...

    • @algeboy
      @algeboy  2 years ago +13

      I like to ask my students: is a list of phone numbers a vector? Sure, it is a list of numbers, but would we learn something by rescaling this list or adding it to another? The point of this video is similar: to remind us that a data structure that looks like another isn't reason enough to import the entire theory of such structures. A priori an image is not helped by rescaling an individual row of pixels, so in this sense the matrix that holds an image is not so much a tensor as a convenient container for the data. But if you want a low-rank approximation, then suddenly the image as a matrix really does come into play. The point is that these are not a free lunch; any linear interpretation can require considerable subtlety to uncover. Naive storage of data as a grid does not, on the whole, rise to the level of tensor-based methods.
      TPUs and GPUs do a great amount of truly linear work: tensor contractions such as the dot products needed for convolution and for running data through a neural net. And yet an average neural network, with non-linear combiners between layers, breaks apart the linear aspect, so it should be carefully understood that some tensor aspects of NNs are more about holding data in a tensor-like structure than about being a tensor in a fully fledged way.
      But above all I hope I did not give the impression that tensors had to be 3-valent! Vectors and matrices can be great examples of tensors too, and there are genuine tensorial aspects of neural networks, of images, and of graphs, if we put the time in to properly interpret the information. As someone who studies tensor algebra for a living I'm delighted to see new and impressive uses, but not everything makes the cut.
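      The low-rank point above can be sketched with NumPy's SVD (a random matrix stands in for a grayscale image):

```python
import numpy as np

# Treating an image as a matrix pays off when we want a low-rank
# approximation: keep only the k largest singular values.
rng = np.random.default_rng(0)
img = rng.random((64, 64))  # stand-in for a 64x64 grayscale image

U, s, Vt = np.linalg.svd(img, full_matrices=False)
k = 10
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# By Eckart-Young, `approx` is the best rank-k approximation of `img`
# in the Frobenius norm.
err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(f"rank-{k} relative error: {err:.3f}")
```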

    • @alexisandersen1392
      @alexisandersen1392 2 years ago +4

      @@algeboy Great response. Pleasure to read. Totally agree.

    • @lorimillim9131
      @lorimillim9131 2 years ago +1

      Much appreciated insights.

    • @rajinfootonchuriquen
      @rajinfootonchuriquen 1 year ago

      Multidimensional arrays are not tensors. They don't represent vector spaces. If you thought of your layers as tensors, you would have to think you are changing the basis of the space each epoch, and that doesn't make sense.

  • @diego1694
    @diego1694 2 years ago +1

    I think this argument would hold a lot more weight if mathematicians were actually consistent in their naming conventions. Far from it: each branch of mathematics has its own vocabulary, with many terms shared between them but with very different meanings in each instance. So why should computer science be different? Why should a particular field of mathematics reserve the qualifier "formal" for itself, as if implying that other branches are any less formal? In the end, these are just words. For better or worse, the word "tensor" to refer to N-dimensional arrays has already been widely adopted in the field, and no amount of rambling about it is going to change that.

    • @algeboy
      @algeboy  2 years ago +2

      The greater point: by any words, we should take care as scientists to explain what transformations are valid for our data. By all means, any subject should feel free to define what works for it. "Tensor" today stands a pretty good chance of being confusing, so at least document whatever intention is needed. (And sorry if I implied "formal" meant math; obviously math isn't the only topic that can become overly fussy about things.)

    • @diego1694
      @diego1694 2 years ago

      @@algeboy I don't disagree that it can be confusing, and in fact I will admit that when I was starting in machine learning I was very confused the first time I saw the term tensor (and relieved once I realized that they were just talking about multidimensional arrays). In the same line, C++ calls dynamic arrays vectors while providing very few tools to manipulate them with linear algebra. In both cases the concepts are similar enough in structure to their mathematical counterparts that it doesn't take very long to clear the confusion.

  • @bernardofitzpatrick5403
    @bernardofitzpatrick5403 2 years ago

    Subscribed 🤙🏽

  • @zerotwo7319
    @zerotwo7319 1 year ago

    Sad speed noises. I just want to run Jarvis on my CPU.