IDL Spring 2024: Lecture 15

  • Output-at-end model
    ----The "output-at-end" model produces an output after the entire input sequence is analyzed
    ----The output is produced only at the final time step, after the last input has been read
    --------E.g. isolated word recognition, question answering, or sentiment analysis
        --------In fact, the network produces an output at every time step, but we ignore all but the final one
    ----To train the output-at-end model, we must consider the divergence at the output, i.e. at the final time step
    --------However, this ignores the intermediate (ignored) outputs also produced by the network
    ----A more profitable approach to training is to assume that the intermediate outputs too must match the target output, and to minimize the total divergence over all inputs
    ----This effectively converts the output-at-end model to a time-synchronous model
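As a concrete illustration of the two criteria, the sketch below compares the divergence taken only at the final step with the divergence summed over every step after replicating the target. This is a minimal Python/NumPy sketch under assumed conventions (per-step softmax outputs stored as rows of `outputs`, cross-entropy as the divergence, integer class targets); it is not the lecture's exact code.

```python
import numpy as np

def cross_entropy(probs, target):
    """Divergence between one output distribution and a target class index."""
    return -np.log(probs[target])

def output_at_end_loss(outputs, target):
    """Divergence computed only at the final time step."""
    return cross_entropy(outputs[-1], target)

def time_synchronous_loss(outputs, target):
    """Replicate the target at every time step and sum the local divergences."""
    return sum(cross_entropy(y_t, target) for y_t in outputs)

# Toy example: T = 3 time steps, 4 output classes, target class 2.
outputs = np.array([[0.10, 0.20, 0.60, 0.10],
                    [0.10, 0.10, 0.70, 0.10],
                    [0.05, 0.05, 0.80, 0.10]])
print(output_at_end_loss(outputs, 2))    # loss at the final step only
print(time_synchronous_loss(outputs, 2)) # loss summed over all steps
```

Summing over every step gives the network a training signal at every time, which is the sense in which this criterion is more profitable than the output-at-end divergence alone.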
    ----The order-synchronous intermittent output model
    ----The order-synchronous intermittent-output model produces outputs intermittently
    ----E.g. in speech recognition, where the input may be hundreds or thousands of spectral vectors, but the output is a much smaller sequence of words or phonemes
    ----Each output symbol corresponds to a segment of the input. The output symbols occur in the same order as the input segments they correspond to, but the length of the segment associated with each output symbol varies from segment to segment
    ----The order-synchronous, intermittent-output model is just a concatenation of several output-at-end models
    ----But this simple extension creates complications
    ----Given an input sequence, at inference time it is unknown how many outputs must be produced and at what times
    ----Inference must select the most likely "compressed" order-synchronous sequence (sequence of symbols of any length less than or equal to the length of the input sequence), given the input
    ----This is not the same as selecting the most likely time-synchronous output that has one output for every input
    ----The greedy algorithm for inference simply selects the most probable output at each time and merges repetitions
    ----The actual output is obtained by merging runs of repeating outputs into a single symbol
    ----This cannot, however, distinguish between genuine repetitions in the target (the same symbol occurring twice in a row) and a run of repeating outputs that must be merged
    ----Greedy decoding also actually selects the most probable time-synchronous sequence, not the most probable order-synchronous sequence
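The greedy procedure described above is easy to state in code. The sketch below is a minimal illustration, assuming the per-time-step symbol probabilities are available as a T x K array (the array contents and symbol indices are made up for the example):

```python
import numpy as np

def greedy_decode(outputs):
    """Pick the most probable symbol at each time step, then merge runs of repeats.

    `outputs` is a (T x K) array of per-time-step symbol probabilities.
    Returns the "compressed" symbol sequence. Note the caveats from the notes:
    this is really the most probable time-synchronous sequence, and genuine
    repetitions in the target cannot be distinguished from runs to be merged.
    """
    best = np.argmax(outputs, axis=1)   # most probable symbol at each step
    merged = [int(best[0])]
    for s in best[1:]:
        if s != merged[-1]:             # keep a symbol only when it changes
            merged.append(int(s))
    return merged

# Toy example: T = 6 steps over 3 symbols; per-step argmax is [0, 0, 1, 1, 1, 2].
outputs = np.array([[0.7, 0.2, 0.1],
                    [0.6, 0.3, 0.1],
                    [0.2, 0.7, 0.1],
                    [0.1, 0.8, 0.1],
                    [0.3, 0.5, 0.2],
                    [0.1, 0.2, 0.7]])
print(greedy_decode(outputs))           # -> [0, 1, 2]
```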
    ----Training the intermittent-output model is simple if the alignment of the target outputs is known for the training inputs
    ----The "alignment" of the output to the input explicitly states at which times of the input each output symbol is produced
    ----We can now just sum the divergences at the individual aligned output times
    ----Alternatively, we can replicate the target output symbol for each aligned output over the entire segment of inputs it represents
    ----This converts it to a time-synchronous model
    ----The divergence is now defined as the sum of local divergences between the actual and target outputs at every time
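The sketch below illustrates training with a known alignment: each target symbol is replicated over its aligned input segment and the local cross-entropy divergences are summed. The function name, the segment-end representation of the alignment, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def aligned_loss(outputs, target_symbols, segment_ends):
    """Total divergence when the alignment of targets to input segments is known.

    `outputs` is a (T x K) array of per-time-step symbol probabilities,
    `target_symbols` the order-synchronous target, and `segment_ends[i]` the
    (exclusive) end time of the segment aligned to target_symbols[i].
    Each target symbol is replicated over its segment, which converts the
    problem to a time-synchronous one, and the local divergences are summed.
    """
    loss, start = 0.0, 0
    for symbol, end in zip(target_symbols, segment_ends):
        for t in range(start, end):
            loss += -np.log(outputs[t, symbol])
        start = end
    return loss

# Toy example: 6 input steps, target "B A" aligned as steps 0-3 -> B, 4-5 -> A.
outputs = np.full((6, 3), 1.0 / 3.0)    # uniform outputs, for illustration only
print(aligned_loss(outputs, target_symbols=[1, 0], segment_ends=[4, 6]))
```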
    ----In reality, for intermittent-output problems such as speech recognition, the training data does not usually include the alignment of the output sequence to the input
    ----The training data provides only the input sequence and the target output sequence
    ----Possible solutions:
        --------Guess the alignments of training instances
        --------Consider all alignments
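As an illustration of the second option, the sketch below sums the probability of every possible alignment with a simple forward recursion. It assumes, consistent with the model as described so far, that every time step emits exactly one target symbol (no blank symbol) and that the alignment starts on the first target symbol and ends on the last; the function name and toy inputs are illustrative.

```python
import numpy as np

def total_alignment_prob(outputs, target):
    """Sum of P(alignment) over every expansion of `target` to T time steps.

    `outputs` is a (T x K) array of per-time-step symbol probabilities and
    `target` the order-synchronous symbol sequence. alpha[t, k] is the total
    probability of all partial alignments that end at time t on target[k]:
        alpha[t, k] = outputs[t, target[k]] * (alpha[t-1, k] + alpha[t-1, k-1])
    """
    T, L = outputs.shape[0], len(target)
    alpha = np.zeros((T, L))
    alpha[0, 0] = outputs[0, target[0]]          # must start on the first symbol
    for t in range(1, T):
        for k in range(L):
            stay = alpha[t - 1, k]               # remain on the same symbol
            advance = alpha[t - 1, k - 1] if k > 0 else 0.0  # move to next symbol
            alpha[t, k] = outputs[t, target[k]] * (stay + advance)
    return alpha[-1, -1]                         # must end on the last symbol

# Toy example: 4 time steps, 3 symbols, target sequence [0, 2].
outputs = np.full((4, 3), 1.0 / 3.0)
print(total_alignment_prob(outputs, target=[0, 2]))
```

In practice the recursion would be carried out in log space to avoid numerical underflow.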
  • Defining the alignment
    ----The "alignment" of an order-synchronous sequence to an input sequence indicates the specific times at which the symbols in the output sequence are produced
    ----This can also be represented by repeating each symbol in the order-synchronous sequence over the segment of input it corresponds to
    ----This converts the order-synchronous sequence to a time-synchronous sequence
    ----The time-synchronous sequence obtained by repeating symbols in the order-synchronous sequence is an "expansion" of the order-synchronous sequence
    ----The order-synchronous sequence can be recovered from the time-synchronous sequence by eliminating repetitions
    ----This is a "compression" of the time-synchronous sequence
    ----An order-synchronous sequence can be expanded into a time-synchronous sequence in many ways
    ----Many different time-synchronous sequences, however, compress to the same order-synchronous sequence
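The sketch below makes the expansion/compression relationship concrete: compression merges adjacent repetitions, and an order-synchronous sequence of length L expands to length T by choosing L - 1 segment boundaries among the T - 1 gaps. The function names and example symbols are illustrative assumptions.

```python
from itertools import combinations

def compress(time_synchronous):
    """Merge adjacent repetitions: time-synchronous -> order-synchronous."""
    out = [time_synchronous[0]]
    for s in time_synchronous[1:]:
        if s != out[-1]:
            out.append(s)
    return out

def expansions(order_synchronous, T):
    """Yield all expansions of an order-synchronous sequence to length T.

    An expansion is obtained by choosing where each symbol's segment ends,
    i.e. choosing len(sequence) - 1 boundaries among the T - 1 gaps.
    """
    L = len(order_synchronous)
    for cuts in combinations(range(1, T), L - 1):
        bounds = [0, *cuts, T]
        expansion = []
        for k in range(L):
            expansion += [order_synchronous[k]] * (bounds[k + 1] - bounds[k])
        yield expansion

print(compress(['B', 'B', 'B', 'A', 'A', 'D']))   # -> ['B', 'A', 'D']
for e in expansions(['B', 'A', 'D'], T=5):        # all 6 length-5 expansions
    print(e)
```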
